Abstract — This paper considers a simple on-off random multiple
access channel, where n users communicate simultaneously to
a single receiver over m degrees of freedom. Each user transmits
with probability $\lambda$, where typically $\lambda n < m \ll n$, and the
receiver must detect which users transmitted. We show that when
the codebook has i.i.d. Gaussian entries, detecting which users 
transmitted is mathematically equivalent to a certain sparsity 
detection problem considered in compressed sensing. Using recent 
sparsity results, we derive upper and lower bounds on the 
capacities of these channels. We show that common sparsity de- 
tection algorithms, such as lasso and orthogonal matching pursuit 
(OMP), can be used as tractable multiuser detection schemes and 
have significantly better performance than single-user detection. 
These methods do achieve some near-far resistance but — at high 
signal-to-noise ratios (SNRs) — may achieve capacities far below 
optimal maximum likelihood detection. We then present a new 
algorithm, called sequential OMP, that illustrates that iterative 
detection combined with power ordering or power shaping can 
significantly improve the high SNR performance. Sequential 
OMP is analogous to successive interference cancellation in the 
classic multiple access channel. Our results thereby provide 
insight into the roles of power control and multiuser detection 
on random-access signalling. 

Index Terms — compressed sensing, convex optimization, lasso, 
maximum likelihood estimation, multiple access channel, mul- 
tiuser detection, orthogonal matching pursuit, power control, 
random matrices, single-user detection, sparsity, thresholding 



I. Introduction 

In wireless systems, random access refers to any multiple 
access communication protocol where the users autonomously 
decide whether or not to transmit depending on their own 
traffic requirements and estimates of the network load. While 
random access is best known for its use in packet data commu- 
nication in wireless local area networks (LANs) [1], this paper 
considers random access for simple on-off messaging. On-off 
random access signaling can be used for a variety of control 
tasks in wireless networks such as user presence indication, 
initial access, scheduling requests and paging. Random on- 
off signaling is already used for some of these tasks in current 
cellular systems [2], [3].

The limits of on-off random access signaling with multiple
users are not fully understood. To this end, we consider a 
simple random multiple access channel where n users transmit 
to a single receiver. Each user is assigned a single codeword
which it transmits with probability $\lambda$. We wish to understand
the capacity of these channels, by which we mean the total
number of degrees of freedom m needed to reliably detect
which users transmit as a function of n, $\lambda$, and the channel
conditions. We also wish to establish performance bounds for 
specific decoding algorithms. 

This on-off random access channel is related to the classic 
multiple access channel (MAC) in network information theory 
[4], [5]. The theory of the MAC channel is well understood 
[6]-[9] and has been applied in commercial CDMA systems 
[10]. Unfortunately, it is difficult to apply the classic MAC 
channel analysis directly to the on-off random access channel 
under consideration here. 

In the traditional analysis of the MAC channel, the number 
of users remains constant, while the number of degrees of 
freedom of the channel goes to infinity. As a result, each 
user can employ a capacity-achieving code with an infinite 
block length. However, in the on-off random access channel 
considered here, as the number of degrees of freedom of the 
channel is increased, the goal is not to scale the number of bits 
per user, but rather the total number of users. Since each user 
only transmits at most one bit of information, channel coding 
cannot be used for reliability, and the classic MAC capacity 
results do not apply. 

Our analysis is instead based on identifying a connection 
between the on-off random access channel and the recovery 
of the sparsity pattern of a signal from noisy random linear 
measurements. The feasibility of recovering sparse, approxi- 
mately sparse, or compressible signals from a relatively small 
number of random linear measurements has recently been 
termed compressed sensing [11]-[13]. When the users in the 
on-off random access channel employ certain large random 
codebooks, we show that the problem at the receiver of 
detecting the active users is precisely the sparsity detection 
problem addressed in several recent works in the compressed 
sensing literature [14]-[17]. 

Results in compressed sensing generally provide bounds on 
the estimation error of a signal as a function of the number 
of measurements, the signal sparsity and other factors. How- 
ever, what is relevant for the random on-off multiple access 
channel is detecting the positions of the nonzero entries. This 
problem arises in subset selection in linear regression [18]. 

By exploiting recent compressed sensing results and providing
an analysis of a new algorithm, we are able to provide
a number of insights:

• Performance bounds with ML detection: Recent results in 
[14], [15], [19] provide simple upper and lower bounds on 
the number of measurements required to detect the users 
reliably assuming maximum likelihood (ML) detection. 
One of the consequences of these bounds is that, unlike 
the classic MAC channel, the sum rate achievable with 
random access signaling can be strictly less than the rate 
achievable with coordinated transmissions with the same 
total power.

• Potential gains over single-user detection: ML detection 
can be considered as a type of multiuser detection. 
Current commercial designs, however, almost universally 
use simple single-user detection (see, for example [20] 
for a typical WCDMA design). The single-user detection 
performance can be estimated by bounds given in [15], 
[21]. The bounds show that ML detection offers a poten- 
tially large gain over single-user detection, particularly 
at high SNRs. The gap at high SNRs can be explained 
by a certain self-noise limit experienced by single-user 
detection. 

• Lasso- and OMP-based multiuser detection and near- 
far resistance: ML sparsity detection is a well-known 
NP-hard problem [22]. However, there are practical, but 
suboptimal, algorithms such as the orthogonal matching 
pursuit (OMP) [23]-[26] and "lasso" [27] methods in 
sparse estimation that can be used for multiuser de- 
tection methods for the on-off random access channel. 
In comparison to single-user detection, we show that 
these methods can offer improved performance when the 
dynamic range in received power levels is large. This 
near-far resistance feature is similar to that of standard 
MMSE multiuser detection in CDMA systems [28]. 

• Improved high SNR performance with power shaping: 
While both lasso and OMP offer improvements over 
single-user detection, there is still a large gap in the 
performance of these algorithms in comparison to ML 
detection at high SNRs. Specifically, at high SNRs, ML 
achieves a fundamentally different scaling in the number 
of measurements required for reliable detection than that 
required by lasso, OMP and single-user detection. 

We show, however, that when accurate power control is 
available, the ML scaling can be theoretically achieved 
with a simplified version of OMP, which we call sequen- 
tial OMP (SeqOMP). The method is analogous to the 
classic successive interference cancellation (SIC) method 
for the MAC channel. Specifically, users are deliberately 
targeted at different received power levels and then detected
and cancelled out in descending order of power.
While SeqOMP shows significant gains over single-user 
detection, for most practical problem sizes it does worse 
than standard OMP, even without power shaping. How- 
ever, we show, at least by simulation, that power shaping 
can improve the performance of OMP as well. 

The connection between sparsity detection methods such as 
OMP and the SIC technique for the MAC channel has also 
been observed in the recent work of Jin and Rao [29]. A related
work by Wipf and Rao [30] also gave some empirical evidence
for the benefit of power shaping when used in conjunction 
with sparse Bayesian learning algorithms. Both the works [29] 
and [30] are discussed in more detail below. The results in 
this paper make the connections between sparsity detection 
and the random access MAC channel more precise by giving 
concrete conditions on the detectability of the sparsity pattern, 
characterizing the optimal power shaping distribution, and 
contrasting the classic MAC and on-off random access MAC 
capacities. 

The remainder of the paper is organized as follows. The
setting is formalized in Section II. In particular, we define
all the key problem parameters. Results that can be derived
from existing necessary and sufficient conditions for sparsity
pattern recovery are then presented in Section III. We
will see that there is a potentially large performance gap
between single-user detection and the optimal ML detection.
Existing "practical" multiuser detection techniques perform
significantly better than single-user detection in that they are
near-far resistant. However, their performance saturates at high
SNRs, falling well short of ML detection. Section IV presents
a new detection algorithm, sequential orthogonal matching
pursuit (SeqOMP), that has near-far resistance under certain
assumptions on power control. Furthermore, with optimal
power shaping, it does not suffer from saturation at high SNRs.
Numerical experiments are reported in Section V. Connections
to MAC capacity are discussed in Section VI, conclusions are
given in Section VII, and proofs are relegated to the Appendix.

II. On-Off Random Access Channel Model 
A. Problem Formulation 

Assume that there are n transmitters sharing a wireless
channel to a single receiver. Each user j is assigned a unique,
dedicated codeword represented as an m-dimensional vector
$\mathbf{a}_j \in \mathbb{C}^m$, where m is the total number of degrees of freedom
in the channel. By degrees of freedom we simply mean the
dimension of the received vector, which represents the number
of samples in time or frequency depending on the modulation.
In any channel use, only some fraction of the users, $\lambda \in (0, 1)$,
transmit their codeword. The fraction $\lambda$ will be called the
activity ratio and any user that transmits will be called active.

The signal at the receiver from each user j is modeled as
$x_j \mathbf{a}_j$ where $x_j$ is a complex scalar. If the user is not active,
$x_j = 0$. If the user is active, $x_j$ represents the product
of the transmitted symbol and channel gain. The total signal
at the receiver is given by

$$\mathbf{y} = \sum_{j=1}^{n} \mathbf{a}_j x_j + \mathbf{w} = \mathbf{A}\mathbf{x} + \mathbf{w}, \tag{1}$$

where $\mathbf{w} \in \mathbb{C}^m$ represents noise. The matrix $\mathbf{A} \in \mathbb{C}^{m \times n}$
is formed by the codewords $\mathbf{a}_j$,

$$\mathbf{A} = [\mathbf{a}_1 \ \cdots \ \mathbf{a}_n],$$

and will be called the codebook. The vector $\mathbf{x} = [x_1 \ \cdots \ x_n]^T$
will be called the modulation vector, and its components
$\{x_j\}_{j=1}^{n}$ are referred to as the received modulation symbols.

Given a modulation vector $\mathbf{x}$, define the active user set as

$$I_{\rm true} = \{\, j : x_j \neq 0 \,\}, \tag{2}$$

which is the "true" set of active users. The size of the active
user set is related to the activity ratio through

$$\lambda = \frac{1}{n} |I_{\rm true}|. \tag{3}$$

The goal of the receiver is to determine an estimate $\hat{I} = \hat{I}(\mathbf{y})$
of $I_{\rm true}$ based on the received noisy vector $\mathbf{y}$.

For the most part, we will be interested in estimators that
exploit minimal prior knowledge of the modulation vector
$\mathbf{x}$ other than it being sparse. In particular, we will limit
our attention to estimators that do not explicitly require a
priori knowledge of the complex modulation symbols $x_j$.
This restriction is needed because the channel gain is typically
unknown at the receiver in random access channels: users
conducting random access communication are unlikely
to be sending any persistent pilot reference.

We consider large random codebooks where the entries of
$\mathbf{A}$ are i.i.d. $\mathcal{CN}(0, 1/m)$. We assume the noise vector is also
Gaussian: $\mathbf{w} \sim \mathcal{CN}(0, (1/m) I_m)$. Given an estimator $\hat{I} = \hat{I}(\mathbf{y})$,
the probability of error,

$$p_{\rm err} = \Pr\left( \hat{I} \neq I_{\rm true} \right), \tag{4}$$

is taken with respect to the random codebook $\mathbf{A}$, the noise vector
$\mathbf{w}$, and the statistical distribution of the modulation vector $\mathbf{x}$.
We want to find estimators $\hat{I}$ that bring $p_{\rm err}$ close to zero.
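The model above can be simulated directly. The following is a minimal sketch, assuming NumPy is available; the function names, the equal received power for all active users, and the uniformly random phases are illustrative assumptions and are not part of the channel model itself.

```python
import numpy as np

def complex_normal(rng, shape, var):
    """Draw i.i.d. circularly symmetric complex Gaussians CN(0, var)."""
    scale = np.sqrt(var / 2.0)
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

def simulate_channel(n, m, lam, power, rng):
    """One use of the on-off random access channel y = A x + w.

    Assumption for illustration: every active user arrives with the same
    power |x_j|^2 = power and an independent uniform phase.
    """
    k = int(round(lam * n))                          # number of active users
    active = np.sort(rng.choice(n, size=k, replace=False))
    x = np.zeros(n, dtype=complex)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=k)
    x[active] = np.sqrt(power) * np.exp(1j * phases)
    A = complex_normal(rng, (m, n), 1.0 / m)         # codebook, entries CN(0, 1/m)
    w = complex_normal(rng, m, 1.0 / m)              # noise, CN(0, (1/m) I_m)
    y = A @ x + w
    return A, x, y, set(active.tolist())

rng = np.random.default_rng(0)
A, x, y, I_true = simulate_channel(n=100, m=40, lam=0.1, power=2.0, rng=rng)
print("active users:", sorted(I_true))
```

A detector is then any function of (A, y) that returns a set of indices, and the error probability in (4) can be estimated by repeating the draw many times.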

We will see that two key factors influence the ability to
detect the active user set. The first is the total SNR defined as

$$\mathrm{SNR} = \frac{\mathbb{E}\|\mathbf{A}\mathbf{x}\|^2}{\mathbb{E}\|\mathbf{w}\|^2}. \tag{5}$$

Since the components of the matrix $\mathbf{A}$ and noise vector $\mathbf{w}$ are
i.i.d. $\mathcal{CN}(0, 1/m)$, it can be verified that, for deterministic $\mathbf{x}$,

$$\mathrm{SNR} = \|\mathbf{x}\|^2. \tag{6}$$

In the case of random $\mathbf{x}$, this expression is the conditional
SNR given $\mathbf{x}$; we will have both deterministic and random
formulations.
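The verification of the SNR expression for deterministic x is short enough to write out. It uses only the facts that the codebook and noise entries are independent, zero-mean, and of variance 1/m, so cross terms vanish in expectation; the entry notation $A_{ij}$ is introduced here for the derivation.

```latex
\mathbb{E}\,\bigl|(\mathbf{A}\mathbf{x})_i\bigr|^2
  = \sum_{j=1}^{n} \mathbb{E}|A_{ij}|^2 \, |x_j|^2
  = \frac{1}{m}\,\|\mathbf{x}\|^2
\quad\Longrightarrow\quad
\mathbb{E}\|\mathbf{A}\mathbf{x}\|^2
  = \sum_{i=1}^{m} \frac{1}{m}\,\|\mathbf{x}\|^2
  = \|\mathbf{x}\|^2,
\qquad
\mathbb{E}\|\mathbf{w}\|^2 = \sum_{i=1}^{m} \frac{1}{m} = 1 .
```

Taking the ratio of the two expectations gives SNR equal to the squared norm of x.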

The second factor is what we will call the minimum-to-average
ratio

$$\mathrm{MAR} = \frac{\min_{j \in I_{\rm true}} |x_j|^2}{\|\mathbf{x}\|^2 / \lambda n}. \tag{7}$$

Since $I_{\rm true}$ has $\lambda n$ elements, $\|\mathbf{x}\|^2 / \lambda n$ is the average of
$\{ |x_j|^2 : j \in I_{\rm true} \}$. Therefore, $\mathrm{MAR} \in (0, 1]$ with the upper
limit occurring when all the nonzero entries of $\mathbf{x}$ have the
same magnitude. MAR is a deterministic quantity when $\mathbf{x}$ is
deterministic and a random variable otherwise.

One final value that will be important is the minimum 
component SNR, which, for a given x, is given by 

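$$\mathrm{SNR}_{\min} = \frac{1}{\mathbb{E}\|\mathbf{w}\|^2} \min_{j \in I_{\rm true}} \mathbb{E}\|\mathbf{a}_j x_j\|^2 = \min_{j \in I_{\rm true}} |x_j|^2, \tag{8}$$

where $\mathbf{a}_j$ is the jth column of $\mathbf{A}$. The quantity $\mathrm{SNR}_{\min}$ has
a natural interpretation: The numerator, $\min \mathbb{E}\|\mathbf{a}_j x_j\|^2$, is the
signal power due to the smallest nonzero component in $\mathbf{x}$,
while the denominator, $\mathbb{E}\|\mathbf{w}\|^2$, is the total noise power. The
ratio $\mathrm{SNR}_{\min}$ thus represents the contribution to the SNR from
the smallest nonzero component of the unknown vector $\mathbf{x}$.
The final equality in (8) is a consequence of the fact that
$\mathbb{E}\|\mathbf{a}_j\|^2 = \mathbb{E}\|\mathbf{w}\|^2 = 1$. Observe that (6) and (7) show

$$\mathrm{SNR}_{\min} = \min_{j \in I_{\rm true}} |x_j|^2 = \frac{1}{\lambda n}\, \mathrm{SNR} \cdot \mathrm{MAR}. \tag{9}$$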
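The relation between the three quantities can be checked numerically for any deterministic modulation vector. The sketch below uses plain Python; the helper name `snr_mar` and the example vector are hypothetical and chosen only for illustration.

```python
def snr_mar(x):
    """Return (lam, SNR, MAR, SNR_min) for a deterministic modulation vector x."""
    n = len(x)
    powers = [abs(v) ** 2 for v in x if abs(v) > 0]   # |x_j|^2 over the active users
    k = len(powers)                                   # number of active users, lam * n
    lam = k / n
    snr = sum(powers)                                 # total SNR: ||x||^2
    mar = min(powers) / (snr / k)                     # minimum-to-average ratio
    snr_min = min(powers)                             # minimum component SNR
    return lam, snr, mar, snr_min

# Ten users, three of them active with unequal received powers 4, 1 and 2.
x = [0, 2 + 0j, 0, 0, 1j, 0, 0, 0, 1 + 1j, 0]
lam, snr, mar, snr_min = snr_mar(x)

# The minimum component SNR equals SNR * MAR / (lam * n).
print(snr_min, snr * mar / (lam * len(x)))
```

With equal powers the same function returns MAR = 1, the power-controlled case.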



B. MAR and Power Control 

For wireless systems, the factor MAR in (7) has an important
interpretation as a measure of the dynamic range of received
power levels. With accurate power control, all users can be
controlled to arrive at the same power. In this case, MAR = 1.
However, if power control is difficult due to fading or lack of 
power control feedback, there can be a considerable dynamic 
range in the received powers from different users. In this case, 
some users could arrive at powers much below the average 
making MAR closer to zero. 

One of the results in this paper is a precise quantification of 
the effect of MAR on the detectability of the active user set. 
Specifically, we will show that low MAR can make reliable 
detection significantly more difficult for certain algorithms. 
The problem is analogous to the well-known near-far effect 
in CDMA systems [28], where users with weak signals can 
be dominated by higher-power signals. 

C. Synchronization and Multi-Path 

It is important to recognize that an implicit assumption in 
the above model is that the transmissions from different users 
are perfectly synchronized. At a minimum, the timing offsets 
from the users are exactly known at the receiver and there is 
no multipath. 

Of course, in many wireless applications, exact synchroniza- 
tion is not possible and the receiver must estimate the timing 
delay of the transmission as part of the detection process. 
In most practical receivers, timing offsets are estimated by 
discretizing the delay search space, typically to a quarter or 
half-chip resolution. The receiver then searches over a finite 
set of delay hypotheses depending on the range of timing 
uncertainty. In the presence of multipath, the receiver could 
detect multiple delay hypotheses. 

To model this search in the theoretical framework of this pa- 
per, we would need to model each timing shift of the codeword 
as a different codeword. The total number of codewords would 
then grow to the number of users times the number of delay 
hypotheses per user. While the algorithms we will present 
can be applied in this manner to deal with the asynchronous 
case, there are several theoretical issues with extending the 
analysis. In particular, this extended codebook would lack the 
independence of codewords that the simpler model has by 
construction. We will thus consider only the synchronous
case for the remainder of this paper.

III. Performance with Current Sparsity 
Detection Methods 

The problem of detecting the active user set is precisely 
equivalent to a sparsity pattern recovery problem. To see this,
note that the modulation vector $\mathbf{x}$ is sparse, with nonzero
components only in positions corresponding to the active
users. The problem at the receiver is to detect these nonzero
positions in $\mathbf{x}$ from noisy linear observations $\mathbf{y}$ in (1).

In this section, we develop asymptotic analyses for detection 
of the active users based on previous results on sparsity pattern 
recovery. We model $\mathbf{x}$ as deterministic, so the quantities $\lambda$,
SNR, MAR and $\mathrm{SNR}_{\min}$ are also deterministic. Since our
formulation allows simple translation of results from [14]- 
[17], we state these translations without detailed justifications. 
Several results are here adjusted by a factor of two because 
we have complex, rather than real, measurements. 

Our results are expressed as scaling laws on the number of 
measurements for asymptotic reliable detection of the active 
user set. We define this as follows: 

Definition 1: Suppose that we are given deterministic sequences
$m = m(n)$ and $\mathbf{x} = \mathbf{x}(n) \in \mathbb{C}^n$ that vary with n.
For a given detection algorithm $\hat{I} = \hat{I}(\mathbf{y})$, we then define the
probability of error $p_{\rm err}$ in (4), where the probability is taken
over the randomness of the codebook $\mathbf{A}$ and the noise vector
$\mathbf{w}$. Given the number of measurements $m(n)$ and modulation
vector $\mathbf{x}(n)$, the probability of error will then simply be a
function of n. We say that the detection algorithm achieves
asymptotic reliable detection when $p_{\rm err}(n) \to 0$.

Table I summarizes the results from this section and previews
results from Section IV.



A. Optimal Detection with No Noise

To understand the limits of detection, it is useful to first
consider the minimum number of measurements when there
is no noise. Since the activity ratio is $\lambda$, $\mathbf{x}$ will have $k = \lambda n$
nonzero components. For a lower bound on the minimum number
of measurements needed for reliable detection, suppose
that the receiver knows the number of active users k as side
information.

With no noise, the received vector is $\mathbf{y} = \mathbf{A}\mathbf{x}$, which
will belong to one of $J = \binom{n}{k}$ subspaces spanned by k
columns of $\mathbf{A}$. If $m > k$, then these subspaces will be distinct
with probability 1. Thus, an exhaustive search through the
subspaces will reveal which subspace $\mathbf{y}$ belongs to and thus
determine the active user set. This shows that with no noise
and no computational limits, the scaling in measurements of

$$m > \lambda n \tag{10}$$

is sufficient for asymptotic reliable detection.

Conversely, if no prior information is known at the receiver
other than $\mathbf{x}$ being k-sparse, then the condition (10) is also
necessary. If $m \le k = \lambda n$, then for almost all codebooks $\mathbf{A}$,
any k columns of $\mathbf{A}$ span $\mathbb{C}^m$. Consequently, any received
vector $\mathbf{y} = \mathbf{A}\mathbf{x}$ is consistent with any k users transmitting.
Thus, the active user set cannot be determined without further
prior information on the modulation vector $\mathbf{x}$.

B. ML Detection with Noise

Now suppose there is noise. Since $\mathbf{x}$ is an unknown deterministic
quantity, the probability of error in detecting the active
user set is minimized by maximum likelihood (ML) detection.
Since the noise $\mathbf{w}$ is Gaussian, the ML detector finds the
k-dimensional subspace spanned by k columns of $\mathbf{A}$ containing
the maximum energy of $\mathbf{y}$.

The ML estimator was first analyzed by Wainwright [14].
The results in that work, along with the fact that $k = \lambda n$,
show that there exists a constant $C > 0$ such that if

$$m > C \max\left\{ \frac{1}{\mathrm{MAR} \cdot \mathrm{SNR}}\, \lambda n \log(n(1-\lambda)),\ \lambda n \log(1/\lambda) \right\}
    = C \max\left\{ \frac{1}{\mathrm{SNR}_{\min}} \log(n(1-\lambda)),\ \lambda n \log(1/\lambda) \right\}, \tag{11}$$

then ML will asymptotically detect the correct active user set.
The equivalence of the two expressions in (11) is due to (9).
Also, [15, Thm. 1] (generalized in [19, Thm. 1]) shows that,
for any $\delta > 0$, the condition

$$m > \frac{1-\delta}{\mathrm{MAR} \cdot \mathrm{SNR}}\, \lambda n \log(n(1-\lambda)) + \lambda n
    = \frac{1-\delta}{\mathrm{SNR}_{\min}} \log(n(1-\lambda)) + \lambda n \tag{12}$$

is necessary. Observe that when $\mathrm{SNR} \cdot \mathrm{MAR} \to \infty$, the lower
bound (12) approaches $m > \lambda n$, matching the noise-free case
(10) as expected.

These necessary and sufficient conditions for ML appear in
Table I with smaller terms and the infinitesimal $\delta$ omitted for
simplicity.
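For very small problems the ML detector can be run by brute force, which makes the subspace-energy description concrete. The following is a minimal sketch, assuming NumPy and a known number of active users k; the function name `ml_detect` and the toy problem sizes are illustrative assumptions. The search visits all k-subsets of the n users, so it is only usable when n and k are tiny.

```python
import itertools
import numpy as np

def ml_detect(A, y, k):
    """Exhaustive search for the k columns of A whose span holds the most energy of y."""
    best_set, best_energy = None, -np.inf
    for subset in itertools.combinations(range(A.shape[1]), k):
        Q, _ = np.linalg.qr(A[:, list(subset)])        # orthonormal basis of the candidate subspace
        energy = np.linalg.norm(Q.conj().T @ y) ** 2   # energy of y captured by that subspace
        if energy > best_energy:
            best_set, best_energy = set(subset), energy
    return best_set

rng = np.random.default_rng(1)
n, m, k = 12, 8, 2
A = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2 * m)
I_true = {3, 7}
x = np.zeros(n, dtype=complex)
x[sorted(I_true)] = [1.0 + 0.5j, -0.8j]
w = 0.01 * (rng.standard_normal(m) + 1j * rng.standard_normal(m))   # weak noise, illustrative scale
I_hat = ml_detect(A, A @ x + w, k)
print("ML estimate:", sorted(I_hat))
```

The number of candidate subspaces grows combinatorially with n, which is the computational obstacle that motivates the practical detectors considered next.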



C. Single User Detection 

The most common and simple method to detect the active
user set is a single-user detection estimator of the form

$$\hat{I}_{\rm SUD} = \{\, j : \rho(j) > \mu \,\}, \tag{13}$$

where $\mu > 0$ is a threshold parameter and $\rho(j)$ is the
correlation coefficient,

$$\rho(j) = \frac{|\mathbf{a}_j' \mathbf{y}|^2}{\|\mathbf{a}_j\|^2 \|\mathbf{y}\|^2}. \tag{14}$$

Single-user detection has been analyzed in the compressed
sensing context in [15], [21], [31]. A small modification of
[15] shows the following result: Suppose

$$m(n) > \frac{(1+\delta) L(\lambda, n)(1 + \mathrm{SNR})}{\mathrm{SNR} \cdot \mathrm{MAR}}\, \lambda n
      = \frac{(1+\delta) L(\lambda, n)(1 + \mathrm{SNR})}{\mathrm{SNR}_{\min}}, \tag{15}$$

where $\delta > 0$ and

$$L(\lambda, n) = \left[ \sqrt{\log(n(1-\lambda))} + \sqrt{\log(n\lambda)} \right]^2. \tag{16}$$

Then there exists a sequence of detection thresholds $\mu = \mu(n)$
such that single-user detection achieves asymptotic reliable
detection of the active user set. As before, the equivalence
of the two expressions in (15) is due to (9).
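Single-user detection amounts to one matrix-vector product followed by a threshold. The sketch below assumes NumPy and takes the correlation coefficient to be the squared inner product of each codeword with y, normalized by both squared norms; the function name `sud_detect` and the test instance are illustrative assumptions.

```python
import numpy as np

def sud_detect(A, y, mu):
    """Single-user detection: threshold the normalized correlation of each codeword with y."""
    num = np.abs(A.conj().T @ y) ** 2                          # |a_j' y|^2 for every user j
    den = np.sum(np.abs(A) ** 2, axis=0) * np.linalg.norm(y) ** 2
    rho = num / den                                            # lies in [0, 1] by Cauchy-Schwarz
    return set(np.flatnonzero(rho > mu).tolist()), rho

rng = np.random.default_rng(2)
n, m = 50, 25
A = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2 * m)
x = np.zeros(n, dtype=complex)
x[[4, 17, 30]] = 3.0
w = (rng.standard_normal(m) + 1j * rng.standard_normal(m)) / np.sqrt(2 * m)
I_hat, rho = sud_detect(A, A @ x + w, mu=0.2)   # mu chosen by hand for this toy example
print("SUD estimate:", sorted(I_hat))
```

Because each correlation is computed against the full received vector, the contributions of all other active users enter the denominator as if they were noise.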

| | finite SNR · MAR | SNR · MAR → ∞ |
|---|---|---|
| Necessary for ML | $m > \frac{1}{\mathrm{MAR} \cdot \mathrm{SNR}}\, \lambda n \log(n(1-\lambda))$, Fletcher et al. [15, Thm. 1] | $m > \lambda n$ (elementary) |
| Sufficient for ML | $m > \frac{C}{\mathrm{MAR} \cdot \mathrm{SNR}}\, \lambda n \log(n(1-\lambda))$, Wainwright [14] | $m > \lambda n$ (elementary) |
| Sufficient for sequential OMP with power shaping | $m > \frac{\ldots}{\log(1 + \mathrm{SNR})}\, \lambda n \log(n(1-\lambda))$, from Theorem 1 (Section IV-E) | $m > 5 \lambda n$, from Theorem 1 (Section IV-F) |
| Necessary and sufficient for lasso | unknown (expression to the right is necessary) | $m > \lambda n \log(n(1-\lambda))$, Wainwright [16] |
| Sufficient for OMP | unknown | $m > 2 \lambda n \log(n)$, Tropp and Gilbert [17] |
| Sufficient for single-user detection (13) | $m > \frac{4(1 + \mathrm{SNR})}{\mathrm{MAR} \cdot \mathrm{SNR}}\, \lambda n \log(n(1-\lambda))$, Fletcher et al. [15, Thm. 2] | |

TABLE I

Summary of results on measurement scalings for asymptotic reliable detection for various detection algorithms.
Only leading terms are shown. See body for definitions and additional technical limitations.

Comparing the sufficient condition (15) for single-user detection
with the necessary condition (12), we see two distinct
problems in single-user detection:



• Constant offset: The scaling (15) for single-user detection
shows a factor $L(\lambda, n)$ instead of $\log((1-\lambda)n)$ in (12).
It is easily verified that, for $\lambda \in (0, 1/2)$,

$$\log((1-\lambda)n) < L(\lambda, n) < 4 \log((1-\lambda)n), \tag{17}$$

so this difference in factors alone could require that
single-user detection use up to four times more measurements
than ML for asymptotic reliable detection.
Combining the inequality (17) with (15), we see that the
more stringent, but simpler, condition

$$m(n) > \frac{4(1+\delta)(1 + \mathrm{SNR})}{\mathrm{SNR} \cdot \mathrm{MAR}}\, \lambda n \log((1-\lambda)n) \tag{18}$$

is also sufficient for asymptotic reliable detection with
single-user detection. This simpler condition is shown
in Table I, where we have omitted the infinitesimal $\delta$
quantity to simplify the table entry.

• Self-noise limit: In addition to the $L(\lambda, n) / \log(n(1-\lambda))$
offset, single-user detection also requires a factor of
1 + SNR more measurements than ML. This 1 + SNR
factor has a natural interpretation as self-noise: When
detecting any one component of the vector $\mathbf{x}$, single-user
detection sees the energy from the other $n - 1$
components of the signal as interference. We can think
of this additional noise as self-noise, by which we mean
the interference caused from different components of the
signal $\mathbf{x}$ interfering with one another in the observed
signal $\mathbf{y}$ through the measurement matrix $\mathbf{A}$. This self-noise
is distinct from the additive noise $\mathbf{w}$. This self-noise
increases the effective noise by a factor of 1 + SNR, which
results in a proportional increase in the minimum number
of measurements.

This self-noise results in a large performance gap at high
SNRs. In particular, as $\mathrm{SNR} \to \infty$, (15) reduces to

$$m(n) > \frac{(1+\delta) L(\lambda, n)}{\mathrm{MAR}}\, \lambda n. \tag{19}$$

In contrast, ML may be able to succeed with a
scaling $m = O(\lambda n)$ for high SNRs, which is fundamentally
better than the $m = \Omega(\lambda n \log((1-\lambda)n))$ required
by single-user detection.

D. Lasso and OMP Estimation

While ML has clear advantages over single-user detection, it
is not computationally feasible. However, one practical method
used in sparse signal estimation is the lasso estimator [27],
also called basis pursuit denoising [32]. In the context of the
random access channel, the lasso estimator would first estimate
the modulation vector $\mathbf{x}$ by solving the convex minimization

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \left( \|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2 + \mu \|\mathbf{x}\|_1 \right), \tag{20}$$

where $\mu > 0$ is an algorithm parameter that "encourages"
sparsity in the solution $\hat{\mathbf{x}}$. The nonzero components of $\hat{\mathbf{x}}$ can
then be used as an estimate of the active user set.

The exact performance of lasso is not known at finite
SNR. However, Wainwright [16] has exactly characterized
the conditions for lasso to work in the high SNR regime.
Specifically, if m, n and $\lambda n \to \infty$, with $\mathrm{SNR} \cdot \mathrm{MAR} \to \infty$,
the scaling

$$m > \lambda n \log(n(1-\lambda)) + \lambda n + 1 \tag{21}$$

is both necessary and sufficient for asymptotic sparsity recovery.

Another common approach to sparsity pattern detection is
the greedy OMP algorithm [23], [25], [26]. This has been
analyzed by Tropp and Gilbert [17] in a setting with no noise.
They show that, when $\mathbf{A}$ has Gaussian entries, a sufficient
condition for asymptotic reliable recovery is

$$m > 2 \lambda n \log(n) + C \lambda n, \tag{22}$$

where $C > 0$ is a constant. Numerical experiments reported
in [17] suggest that the constant factor 2 may be removed,
although this has not been proven. In any case, OMP with
no noise has a similar scaling in the sufficient number of
measurements as lasso.
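The lasso estimate can be computed with any convex solver. As a minimal sketch, the code below uses the iterative soft-thresholding algorithm (ISTA), a standard proximal-gradient method that is swapped in here for illustration and is not prescribed by the text; NumPy is assumed, and the function names, iteration count, and support tolerance are assumptions.

```python
import numpy as np

def soft_threshold(z, tau):
    """Complex soft-thresholding: shrink each magnitude by tau and keep the phase."""
    mag = np.abs(z)
    return z * np.maximum(0.0, 1.0 - tau / np.maximum(mag, 1e-300))

def lasso_ista(A, y, mu, n_iter=500):
    """Minimize ||y - A x||_2^2 + mu * ||x||_1 by proximal gradient steps."""
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)   # inverse Lipschitz constant of the gradient
    x = np.zeros(A.shape[1], dtype=complex)
    for _ in range(n_iter):
        grad = 2.0 * (A.conj().T @ (A @ x - y))      # gradient of the quadratic term
        x = soft_threshold(x - step * grad, step * mu)
    return x

def lasso_support(x_hat, tol=1e-6):
    """Active user estimate: indices of the (numerically) nonzero components."""
    return set(np.flatnonzero(np.abs(x_hat) > tol).tolist())

rng = np.random.default_rng(3)
n, m = 60, 30
A = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2 * m)
x = np.zeros(n, dtype=complex)
x[[5, 20, 41]] = [3.0, -3.0j, 2.0 + 2.0j]
w = (rng.standard_normal(m) + 1j * rng.standard_normal(m)) / np.sqrt(2 * m)
x_hat = lasso_ista(A, A @ x + w, mu=0.5)             # mu chosen by hand for this toy example
print("lasso estimate:", sorted(lasso_support(x_hat)))
```

When the codebook is the identity, the iteration reaches the closed-form solution (soft-thresholding of y at level mu/2) after a single step, which gives a convenient sanity check.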

The conditions (21) and (22) are both shown in Table I. As
usual, the table entries are simplified by including only the
leading terms.

The lasso and OMP scaling laws, (21) and (22), can be
compared with the high SNR limit for the single-user detection
scaling law in (19). This comparison shows the following:

• Removal of the constant offset: The $L(\lambda, n)$ term in the
single-user detection expression (19) is replaced by a
$\log(n(1-\lambda))$ term in the lasso scaling law (21) and
$2 \log(n)$ for the OMP scaling law (22). Similar to the
discussion above, this implies that lasso could require up
to 4 times fewer measurements than single-user detection.
OMP could require 2 times fewer.

• Near-far resistance: In addition, neither the lasso nor the OMP
scaling law depends on MAR; thus, in the
high SNR regime, they have a near-far resistance that
single-user detection does not. This gain can be large
when there are users whose received powers are much
below the average (low MAR).

The near-far resistance of lasso and OMP is analogous
to that of MMSE multiuser detection in CDMA systems
[28]. In that case, when the number of degrees of freedom
m exceeds the number of users n, a decorrelating detector
can null out strong users while recovering weak ones. An
interesting property that we see in the random access case
is that near-far resistance may be possible when m < n,
provided that m is sufficiently greater than the number
of active users, $\lambda n$.

• Limits at high SNR: We also see from (21) and (22) that
both lasso and OMP are unable to achieve the scaling
$m = O(\lambda n)$ that may be achievable with ML at high
SNR. Instead, both lasso and OMP have the scaling
$m = O(\lambda n \log((1-\lambda)n))$, similar to the minimum
scaling possible with single-user detection, which suffers
from a self-noise limit.
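The comparisons above are easier to appreciate with numbers. The following sketch evaluates the stated conditions for one hypothetical operating point, with the infinitesimal δ set to zero and the unknown constant term of the OMP condition dropped; the function names and the chosen values of n, λ, SNR and MAR are illustrative assumptions.

```python
import math

def L_factor(lam, n):
    """The factor L(lambda, n) appearing in the single-user detection condition."""
    return (math.sqrt(math.log(n * (1 - lam))) + math.sqrt(math.log(n * lam))) ** 2

def measurement_scalings(n, lam, snr, mar):
    """Measurement counts implied by the conditions in the text (delta = 0)."""
    k = lam * n
    log_term = math.log(n * (1 - lam))
    return {
        "ml_necessary": k * log_term / (mar * snr) + k,              # necessary condition for ML
        "sud_sufficient": 4 * (1 + snr) / (snr * mar) * k * log_term,  # simplified SUD condition
        "lasso_high_snr": k * log_term + k + 1,                      # lasso, high SNR regime
        "omp_noiseless": 2 * k * math.log(n),                        # OMP leading term, no noise
    }

n, lam, snr, mar = 1000, 0.01, 10.0, 1.0
bounds = measurement_scalings(n, lam, snr, mar)
for name, value in bounds.items():
    print(f"{name:>16s}: m > {value:.1f}")
print("L / log ratio:", L_factor(lam, n) / math.log(n * (1 - lam)))
```

At this operating point the single-user detection condition demands several times more measurements than the lasso condition, and both sit well above the ML necessary condition, in line with the qualitative discussion.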



E. Other Sparsity Detection Algorithms 

Recent interest in compressed sensing has led to a plethora 
of algorithms beyond OMP and lasso. Empirical evidence 
suggests that the most promising algorithms for sparse pattern 
detection are the sparse Bayesian learning methods developed 
in the machine learning community in [33], and introduced 
into signal processing applications in [34], with related work 
in [35]. Unfortunately, a comprehensive summary of these 
algorithms is far beyond the scope of this paper. 

Instead, we will limit our discussion to the lasso and OMP 
methods since these are the algorithms with the most concrete 
analytic results on asymptotic reliable detection. Moreover, 
our interest is not in finding the optimal algorithm, but merely 
to point out general qualitative effects such as near-far and 
self-noise limits which should be considered in evaluating any 
algorithm. 



IV. Sequential Orthogonal Matching Pursuit 

The analyses in the previous section suggest that ML detec- 
tion may offer significant gains over the provable performance 
of current "practical" algorithms such as single-user detection, 
lasso and OMP, when the SNR is high. Specifically, as the 
SNR increases, the performance of these practical methods 
saturates at a scaling in the number of measurements that can 
be significantly higher than that for ML. 

In this section, we show that if accurate power control is 
available, an OMP-like algorithm, which we call sequential
orthogonal matching pursuit or SeqOMP, can break this barrier.
Specifically, the performance of SeqOMP does not saturate at 
high SNR. 

A. Algorithm 

Algorithm 1 (SeqOMP): Given a received vector $\mathbf{y}$ and
threshold level $\mu > 0$, the algorithm produces an estimate
$\hat{I}_{\rm SOMP}$ of the active user set with the following steps:

1) Initialize the counter $j = 1$ and set the initial active user
set estimate to empty: $\hat{I}(0) = \emptyset$.

2) Compute $\mathbf{P}(j)\mathbf{a}_j$ where $\mathbf{P}(j)$ is the projection operator
onto the orthogonal complement of the span of $\{\mathbf{a}_\ell,\ \ell \in \hat{I}(j-1)\}$.

3) Compute the correlation,

$$\rho(j) = \frac{|\mathbf{a}_j' \mathbf{P}(j) \mathbf{y}|^2}{\|\mathbf{P}(j)\mathbf{a}_j\|^2 \|\mathbf{P}(j)\mathbf{y}\|^2}. \tag{23}$$

4) If $\rho(j) > \mu$, add the index j to $\hat{I}(j-1)$. That is, $\hat{I}(j) = \hat{I}(j-1) \cup \{j\}$.
Otherwise, set $\hat{I}(j) = \hat{I}(j-1)$.

5) Increment $j = j + 1$. If $j \le n$ return to step 2.

6) The final estimate of the active user set is $\hat{I}_{\rm SOMP} = \hat{I}(n)$.

The SeqOMP algorithm can be thought of as an iterative 
version of single-user detection with the difference that, after 
an active user is detected, subsequent correlations are per- 
formed only in the orthogonal complement to the detected 
codeword. The method is identical to the standard OMP 
algorithm of [23], [25], [26], except that SeqOMP passes 
through the data only once. For this reason, SeqOMP is 
actually computationally simpler than standard OMP. 
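The single pass of Algorithm 1 translates almost line for line into code. The sketch below assumes NumPy and maintains an orthonormal basis of the detected codewords so that the projection in step 2 is a Gram-Schmidt step; the function name `seq_omp`, the zero-based indexing, and the handling of a vanishing residual are implementation assumptions.

```python
import numpy as np

def seq_omp(A, y, mu):
    """Sequential OMP: one pass over the users in index order."""
    m, n = A.shape
    detected = []
    Q = np.zeros((m, 0), dtype=complex)        # orthonormal basis of span{a_l : l detected}
    for j in range(n):
        a = A[:, j]
        Pa = a - Q @ (Q.conj().T @ a)          # P(j) a_j
        Py = y - Q @ (Q.conj().T @ y)          # P(j) y
        den = np.linalg.norm(Pa) ** 2 * np.linalg.norm(Py) ** 2
        # a_j' P(j) y equals a_j' (P(j) y) because P(j) is an orthogonal projection.
        rho = np.abs(np.vdot(a, Py)) ** 2 / den if den > 0 else 0.0
        if rho > mu:
            detected.append(j)
            Q = np.column_stack([Q, Pa / np.linalg.norm(Pa)])
    return set(detected)

# Toy check with orthonormal codewords: users 0 and 2 are active.
A_toy = np.eye(4, dtype=complex)
y_toy = np.array([1.0, 0.0, 2.0, 0.0], dtype=complex)
print(sorted(seq_omp(A_toy, y_toy, mu=0.1)))
```

In the toy check, once user 0 has been detected and projected out, the correlation for user 2 rises from 0.8 to 1, which is the cancellation effect that distinguishes the method from single-user detection.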

As simulations will illustrate later, SeqOMP generally has 
much worse performance than standard OMP. It is not in- 
tended as a competitive practical alternative. Our interest in 
the algorithm lies in the fact that we can prove positive 
results for SeqOMP. Specifically, we will be able to show that 
this relatively poor algorithm, when used in conjunction with 
power shaping, can achieve a fundamentally better scaling 
at high SNRs than what has been proven is achievable with 
methods such as OMP. We will also provide some simulation 
evidence that OMP can also benefit somewhat from power 
shaping, although we will not be able to prove this here. 

B. Sequential OMP Performance 

The analysis in Section III was based on deterministic vectors $x$. To characterize the SeqOMP performance, it is simpler to use a partially-random model where the active user set is random while the received modulation signal power $|x_j|^2$, conditioned on user $j$ being active, remains deterministic. We reuse the notation $\lambda$ because its meaning remains almost the same.

We assume that each user is active with some probability $\lambda \in (0, 1)$, which we now call the activity probability. The activities of different users are assumed to be independent. Thus, unlike in Section III, $\lambda n$ represents the average number of users that are active, as opposed to the actual number.

Let $p_j$ denote the received modulation symbol power

$$p_j = |x_j|^2 \tag{24}$$

conditional that user $j$ is active. We will call the set $\{p_j\}_{j=1}^n$ the power profile, which we will treat as a deterministic quantity. Since each user transmits with a probability $\lambda$, the total average SNR is given by

$$\mathrm{SNR} = \lambda \sum_{j=1}^{n} p_j. \tag{25}$$

This factor is also deterministic.

Given a power profile, we will see that a key parameter in estimating the performance of the SeqOMP algorithm is what we will call the minimum signal-to-interference and noise ratio (SINR), defined as

$$\gamma = \min_{\ell = 1, \ldots, n} \frac{p_\ell}{\hat{\sigma}^2(\ell)}, \tag{26}$$

where $\hat{\sigma}^2(\ell)$ is given by

$$\hat{\sigma}^2(\ell) = 1 + \lambda \sum_{j=\ell+1}^{n} p_j. \tag{27}$$



The parameters $\gamma$ and $\hat{\sigma}^2(\ell)$ have simple interpretations: Suppose that the SeqOMP algorithm has correctly decoded all the users for $j < \ell$. Then, in detecting the $\ell$th user, the receiver sees the noise $w$ with power $E\|w\|^2 = 1$ and, for each user $j > \ell$, an interference power $p_j$ with probability $\lambda$. Hence, $\hat{\sigma}^2(\ell)$ is the total average interference-plus-noise power seen when detecting the $\ell$th user, assuming perfect cancellation. Since user $\ell$ arrives at a power $p_\ell$, the ratio $p_\ell/\hat{\sigma}^2(\ell)$ in (26) represents the average SINR seen by user $\ell$. The value $\gamma$ is the minimum SINR over all $n$ users.
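The definitions (26) and (27) translate directly into a few lines of code. The following sketch (with hypothetical function names and 0-based list indices, so `powers[0]` is the first user in the detection order) accumulates the tail sum from the last user backwards:

```python
def interference_plus_noise(powers, lam):
    """sigma_hat^2(l) = 1 + lam * sum_{j > l} p_j for each position l, as in (27)."""
    n = len(powers)
    sigma2 = [0.0] * n
    tail = 0.0                      # running sum of powers of later users
    for l in range(n - 1, -1, -1):
        sigma2[l] = 1.0 + lam * tail
        tail += powers[l]
    return sigma2


def min_sinr(powers, lam):
    """Minimum SINR gamma over all users, as in (26)."""
    sigma2 = interference_plus_noise(powers, lam)
    return min(p / s for p, s in zip(powers, sigma2))
```

As a small check, the profile `[4, 2, 1]` with `lam = 0.5` gives interference-plus-noise levels `[2.5, 1.5, 1.0]`, so the per-user SINRs are 1.6, 4/3 and 1, and the minimum SINR is 1.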

Theorem 1: Let $\lambda = \lambda(n)$, $m = m(n)$ and the power profile $\{p_j\}_{j=1}^n = \{p_j(n)\}_{j=1}^n$ be deterministic quantities that all vary with $n$ satisfying the limits $m - \lambda n$, $\lambda n$ and $(1 - \lambda)n \to \infty$, and $\gamma \to 0$. Also, assume the sequence of power profiles satisfies the limit

$$\lim_{n \to \infty} \max_{i = 1, \ldots, n-1} \log(n)\, \hat{\sigma}^{-4}(i) \sum_{j > i} p_j^2 = 0. \tag{28}$$

Finally, assume that for all $n$,

$$m > \frac{(1 + \delta) L(n, \lambda)}{\gamma} + \lambda n, \tag{29}$$

for some $\delta > 0$ and $L(n, \lambda)$ as defined earlier. Then, there exists a sequence of thresholds, $\mu = \mu(n)$, such that SeqOMP will achieve asymptotic reliable detection of the active user set in that

$$P_{\mathrm{err}} = \Pr\left(\hat{I}_{\mathrm{SOMP}} \ne I_{\mathrm{true}}\right) \to 0,$$

where the probability is taken over the randomness in the activities of the users, the codebook $A$, and the noise $w$. The sequence of threshold levels can be selected independent of the sequence of power profiles.

Proof: See Appendix A. ■

The theorem provides a simple sufficient condition on the number of measurements as a function of the SINR $\gamma$, activity probability $\lambda$ and number of users $n$. The condition (28) is somewhat technical, but is satisfied in the cases that interest us. The remainder of this section will discuss some of the implications of this theorem.

C. Near-Far Resistance with Known Power Ordering 

First, suppose that the power ordering $p_j$ is known at the receiver so the receiver can detect the users in order of decreasing power. If, in addition, the SNRs of all the users go to infinity so that $p_j \to \infty$ for all $j$, then it can be verified that $\gamma > 1/(\lambda n)$. In this case, the condition (29) shows that

$$m > (1 + \delta)\lambda n L(n, \lambda) + \lambda n$$

is sufficient for asymptotic reliable detection. This is identical to the lasso performance except for the factor $L(n, \lambda)/\log((1 - \lambda)n)$, which lies in $(0, 4)$ for $\lambda \in (0, 1/2)$. In particular, the minimum number of measurements does not depend on MAR; therefore, similar to lasso and OMP, SeqOMP can theoretically detect users even when they are much below the average power.

With SeqOMP, simply knowing the order of powers is sufficient to achieve near-far resistance when the SNR is sufficiently high. Unlike for single-user detection, unequal received powers do not hurt the performance of SeqOMP, as long as the order of the powers is known at the receiver. The feasibility of knowing the power ordering is addressed in Section IV-I below. We will now look at the effect of the power profile on the performance.

D. Performance with Constant Power 

Consider the case when all the powers $p_j$ are equal. To satisfy the constraint (25), the constant power level must be $p_j = \mathrm{SNR}/(\lambda n)$. From (26), the minimum SINR is $\gamma = \gamma_{\mathrm{const}}$, where

$$\gamma_{\mathrm{const}} = \frac{\mathrm{SNR}}{\lambda\,(n + (n-1)\mathrm{SNR})} \approx \frac{\mathrm{SNR}}{\lambda n (1 + \mathrm{SNR})}, \tag{30}$$

and the approximation holds for large $n$.

It can be verified that the constant power profile satisfies the technical condition (28) provided $\lambda$ is bounded away from zero and the SNR does not grow "too fast". Specifically, the SNR must satisfy $\mathrm{SNR} = o(n/\log(n))$. In this case, we can substitute $\gamma = \gamma_{\mathrm{const}}$ in (29) to obtain the condition

$$m > \frac{(1 + \delta)(1 + \mathrm{SNR}) L(n, \lambda)}{\mathrm{SNR}}\,\lambda n + \lambda n$$

for asymptotic reliable detection. The condition is precisely the condition for single-user detection in Section III-C with MAR = 1 and an additional $\lambda n$ term.

Thus, for a constant power profile, Theorem 1 does not show any benefit in using SeqOMP.

E. Optimal Power Shaping 

The constant power profile, however, is not optimal. Suppose that accurate power control is feasible so that the receive power levels $p_j$ can be set by the receiver. In this case, we can maximize the SINR $\gamma$ in (26) for a given total SNR constraint (25). It is easily verified that any power profile $p_j$ maximizing the SINR $\gamma$ in (26) will satisfy

$$p_\ell = \gamma \left( 1 + \lambda \sum_{j=\ell+1}^{n} p_j \right) \tag{31}$$

for all $\ell = 1, \ldots, n$. The solution to (31) and (25) is given by

$$p_\ell = \gamma (1 + \lambda\gamma)^{n-\ell}, \tag{32}$$

where $\gamma = \gamma_{\mathrm{opt}}$ is the SINR,

$$\gamma_{\mathrm{opt}} = \frac{1}{\lambda}\left[ (1 + \mathrm{SNR})^{1/n} - 1 \right] \approx \frac{1}{\lambda n}\log(1 + \mathrm{SNR}). \tag{33}$$

Here, the approximation holds for large $n$. Again, some algebra shows that, when $\lambda$ is bounded away from zero, the power profile $p_j$ in (32) will satisfy the technical condition (28) when $\log(1 + \mathrm{SNR}) = o(n/\log(n))$.
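For readers who want to see why the exponential profile arises, the short derivation below sketches one route from (31) and (25) to (32) and (33). The auxiliary quantity $s_\ell$ is introduced here only for convenience and does not appear elsewhere in the paper.

```latex
% Auxiliary tail quantity (notation introduced only for this sketch)
s_\ell := 1 + \lambda \sum_{j=\ell+1}^{n} p_j , \qquad s_n = 1 .

% Condition (31) reads p_\ell = \gamma s_\ell, so the tails obey a geometric recursion:
s_{\ell-1} = s_\ell + \lambda p_\ell = (1 + \lambda\gamma)\, s_\ell
\;\Longrightarrow\;
s_\ell = (1 + \lambda\gamma)^{\,n-\ell},
\qquad
p_\ell = \gamma\,(1 + \lambda\gamma)^{\,n-\ell}.

% Imposing the SNR constraint (25):
1 + \mathrm{SNR} = 1 + \lambda \sum_{j=1}^{n} p_j = s_0 = (1 + \lambda\gamma)^{n}
\;\Longrightarrow\;
\gamma = \frac{1}{\lambda}\left[ (1 + \mathrm{SNR})^{1/n} - 1 \right].
```

The large-$n$ approximation in (33) then follows from $(1 + \mathrm{SNR})^{1/n} - 1 \approx \log(1 + \mathrm{SNR})/n$.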

The power profile (32) is exponentially decreasing in the index order $\ell$. Thus, users early in the detection sequence are allocated exponentially higher power than users later in the sequence. This allocation ensures that early users have sufficient power to overcome the interference from all the users later in the detection sequence that are not yet cancelled. This power shaping is analogous to the optimal power allocations in the classic MAC channel when using a SIC receiver [5].

The ratio of the optimal SINR $\gamma_{\mathrm{opt}}$ in (33) to the SINR with a constant power profile, $\gamma_{\mathrm{const}}$ in (30), is given by

$$\frac{\gamma_{\mathrm{opt}}}{\gamma_{\mathrm{const}}} \approx \frac{(1 + \mathrm{SNR})\log(1 + \mathrm{SNR})}{\mathrm{SNR}}.$$

This ratio represents the potential increase in SINR with exponential power shaping relative to the SINR with equal power for all users. The ratio increases with SNR and can be large when the SNR is high. For example, when SNR = 10 dB, $\gamma_{\mathrm{opt}}/\gamma_{\mathrm{const}} \approx 2.6$. When SNR = 20 dB, the gain is even higher at $\gamma_{\mathrm{opt}}/\gamma_{\mathrm{const}} \approx 4.7$.
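The quoted gains can be reproduced numerically from (30), (32) and (33). The sketch below uses hypothetical function names and takes the SNR in linear units (so 10 dB is 10 and 20 dB is 100); `math.expm1` and `math.log1p` are used only to keep $(1+\mathrm{SNR})^{1/n} - 1$ accurate for large $n$.

```python
import math


def gamma_const(snr, lam, n):
    """Exact constant-profile SINR from (30)."""
    return snr / (lam * (n + (n - 1) * snr))


def gamma_opt(snr, lam, n):
    """Exact optimal SINR from (33): ((1 + SNR)^(1/n) - 1) / lam."""
    return math.expm1(math.log1p(snr) / n) / lam


def exponential_profile(snr, lam, n):
    """Power profile (32); list position 0 holds user l = 1."""
    g = gamma_opt(snr, lam, n)
    return [g * (1.0 + lam * g) ** (n - l) for l in range(1, n + 1)]


def shaping_gain(snr, lam, n):
    """Ratio gamma_opt / gamma_const for a finite number of users n."""
    return gamma_opt(snr, lam, n) / gamma_const(snr, lam, n)
```

For a large number of users, `shaping_gain(10.0, 0.1, 10**6)` rounds to 2.6 and `shaping_gain(100.0, 0.1, 10**6)` rounds to 4.7, in line with the values above. The ratio does not depend on `lam`, since $\lambda$ cancels between (30) and (33).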

Based on Theorem 1, this gain in SINR will result in a proportional decrease in the minimum number of measurements. Specifically, if we substitute the SINR $\gamma_{\mathrm{opt}}$ in (33) into (29), we see that the condition

$$m > \frac{(1 + \delta) L(n, \lambda)}{\log(1 + \mathrm{SNR})}\,\lambda n + \lambda n \tag{34}$$

is sufficient for SeqOMP to achieve asymptotic reliable detection of the active users, when the users use exponential power shaping (32).

As before, if $\lambda < 1/2$, we can bound $L(n, \lambda) < 4\log(n(1 - \lambda))$ and the sufficient condition (34) can be simplified to

$$m > \frac{4(1 + \delta)\log(n(1 - \lambda))}{\log(1 + \mathrm{SNR})}\,\lambda n + \lambda n, \tag{35}$$

the leading term of which appears in Table I with the $\delta$ omitted.
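To get a feel for the size of the simplified bound (35), its right-hand side can be evaluated directly. The helper below is a hypothetical illustration (the name and the default `delta = 0.0` are assumptions); it takes the SNR in linear units and is only meaningful for $\lambda < 1/2$, where the bound on $L(n, \lambda)$ applies. Since (35) is an asymptotic sufficient condition, the number it returns should be read as a rough guide rather than a finite-block-length guarantee.

```python
import math


def seqomp_measurement_bound(n, lam, snr, delta=0.0):
    """Right-hand side of (35) for exponential power shaping (requires lam < 1/2)."""
    if not 0.0 < lam < 0.5:
        raise ValueError("the simplification in (35) assumes 0 < lam < 1/2")
    k = lam * n   # average number of active users
    return 4.0 * (1.0 + delta) * math.log(n * (1.0 - lam)) / math.log1p(snr) * k + k
```

With `n = 100`, `lam = 0.1` and `snr = 100` (20 dB), the expression evaluates to roughly 49, and it decreases as the SNR grows.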

F. SNR Saturation 

As discussed earlier, a major problem with both single-user detection and lasso multiuser detection was that their performance "saturates" with high SNR. That is, even as the SNR scales to infinity, the minimum number of measurements scales as $m = O(\lambda n \log((1 - \lambda)n))$. In contrast, optimal ML detection can achieve a scaling $m = O(\lambda n)$, when the SNR is sufficiently high.

An important consequence of (34) is that SeqOMP with exponential power shaping can overcome this bound. Specifically, if we take the scaling of $\mathrm{SNR} = \Theta(\lambda n)$ in (35) and assume that $\lambda$ is bounded away from zero, we see that asymptotically, SeqOMP requires only

$$m > 5\lambda n \tag{36}$$

measurements. In this way, unlike single-user and lasso detection, SeqOMP is able to obtain the scaling $m = O(\lambda n)$ when the SNR $\to \infty$.

G. Power Shaping with Sparse Bayesian Learning 

The fact that power shaping can provide benefits when 
combined with certain iterative detection algorithms confirms 
the observations in the work of Wipf and Rao [30]. That 
work considers signal detection with a certain sparse Bayesian 
learning (SBL) algorithm. They show the following result: 
Suppose $x$ has $k$ non-zero components and $p_i$, $i = 1, 2, \ldots, k$, is the power of the $i$th largest component. Then, for a given measurement matrix $A$, there exist constants $\nu_i > 1$ such that if

$$p_i > \nu_i p_{i-1}, \tag{37}$$

the SBL algorithm will correctly detect the sparsity pattern of $x$.

The condition (37) shows that a certain growth in the powers can guarantee correct detection. The parameters $\nu_i$ however depend in some complex manner on the matrix $A$, so the appropriate growth is difficult to compute. They also provide strong empirical evidence that shaping the power with certain profiles can greatly reduce the number of measurements needed.

The results in this paper add to Wipf and Rao's observations 
showing that growth in the powers can also assist sequential 
OMP. Moreover, for the SeqOMP case, we can explicitly 
derive the optimal power profile for certain large random 
matrices. 

This is not to say that SeqOMP is better than SBL. In fact, 
empirical results in [34] suggest that SBL will outperform 
OMP, which will in turn do better than SeqOMP. As we have 
stressed before, the point here of analyzing SeqOMP is that 
we can easily derive concrete analytic results. These results 
may provide guidance for more sophisticated algorithms. 
H. Robust Power Shaping

The above analysis shows certain benefits of SeqOMP used 
in conjunction with power shaping. However, these gains are 
theoretically only possible at infinite block lengths. Unfortu- 
nately, when the block length is finite, power shaping can 
actually reduce the performance. 

The problem is that when an active user is not detected in 
SeqOMP, the user's energy is not cancelled out and remains as 
interference for all subsequent users in the detection sequence. 
With power shaping, users early in the detection sequence 
have much higher power than users later in the sequence, so 
missing an early user can make the detection of subsequent 
users difficult. At infinite block lengths, the probability of 
missing an active user can be driven to zero. But, at finite 
block lengths, the probability of missing an active user early in 
the sequence will always be nonzero, and therefore a potential 
problem with power shaping. 

The work [36] observed a similar problem when SIC is used 
in the CDMA uplink. To mitigate the problem, [36] proposed 
to adjust the power allocations to make them more robust to 
decoding errors early in the decoding sequence. The same 
technique, which we will call robust power shaping, can be 
applied to the SeqOMP as follows. 

In the condition (31), it is assumed that all the energy of users with index $j < \ell$ has been correctly detected and subtracted. But, following [36], suppose that on average some fraction $\theta \in [0, 1]$ of the energy of users early in the detection sequence is not cancelled out due to missed detections. We will call $\theta$ the leakage fraction. With nonzero leakage, the condition (31) would be replaced by

$$p_\ell = \gamma \left( 1 + \theta\lambda \sum_{j=1}^{\ell-1} p_j + \lambda \sum_{j=\ell+1}^{n} p_j \right). \tag{38}$$

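For given SNR, $\theta$ and $\lambda$, the linear equations (25) and (38) can be solved to obtain the optimal power profile, given by

$$p_\ell = \frac{(1 + \theta\,\mathrm{SNR})\,\gamma}{1 + \lambda\theta\gamma} \left( \frac{1 + \lambda\gamma}{1 + \lambda\theta\gamma} \right)^{n-\ell}, \tag{39}$$

where $\gamma = \gamma(\theta)$, the optimal SINR, is

$$\gamma(\theta) = \frac{1}{\lambda} \cdot \frac{\left( \frac{1 + \mathrm{SNR}}{1 + \theta\,\mathrm{SNR}} \right)^{1/n} - 1}{1 - \theta \left( \frac{1 + \mathrm{SNR}}{1 + \theta\,\mathrm{SNR}} \right)^{1/n}} \approx \frac{1}{\lambda n (1 - \theta)} \log\left( \frac{1 + \mathrm{SNR}}{1 + \theta\,\mathrm{SNR}} \right). \tag{40}$$

The approximation here is valid for large $n$.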

Fig. 1 plots the SINR, $\gamma(\theta)$, as a function of the leakage fraction $\theta$. The SINR is plotted relative to $\gamma_{\mathrm{const}}$ in (30), which is the SINR that one obtains with a constant power profile. The increase in SINR is maximized when the leakage fraction $\theta = 0$. When $\theta = 0$, $\gamma(\theta) = \gamma_{\mathrm{opt}}$, the SINR for the exponential power shaping. This is the optimal SINR, but assumes that there are no missed detections.

As the leakage fraction $\theta$ is increased, the SINR, $\gamma(\theta)$, decreases, which is the price for the robustness to missed detections. In the limit as $\theta \to 1$, the optimal power profile, $p_j$ in (39), approaches a constant and the corresponding SINR, $\gamma(\theta)$, converges to $\gamma_{\mathrm{const}}$. However, even at a reasonable leakage fraction, say $\theta = 0.1$, the SINR $\gamma(\theta)$ can still be significantly larger than $\gamma_{\mathrm{const}}$.

Fig. 2. Robust power shaping: Power profiles for various leakage values $\theta$. For all the curves, the number of users is $n = 100$, SNR = 20 dB, and activity probability is $\lambda = 0.1$.

It is illustrative to actually look at the optimal power profiles as a function of $\theta$. Fig. 2 plots the optimal power profile, $p_j$ in (39), for leakage values of $\theta = 0$, 0.1 and 1. In the plot, $n = 100$, SNR = 20 dB, and $\lambda = 0.1$. It can be seen that when $\theta = 0$, there is a large range of almost 20 dB in the target receive powers from the first to last user. While this power profile is optimal when there are no missed detections, the power allocations can be very damaging if an active user is missed. In an extreme case, for example, if the first user is active but not detected and not cancelled, it will cause an interference level 20 dB above the signal level of the last user. As the leakage fraction $\theta$ is increased, the range of powers is decreased, which improves the robustness to missed detection at the expense of reduced SINR.
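The dependence of the power range on the leakage fraction can be explored with a short numerical sketch. The closed form used below is obtained by solving the recursion implied by (38) together with the constraint (25); the function name, the 0-based list layout and the restriction to $0 \le \theta < 1$ are assumptions of this illustration, and the SNR is in linear units.

```python
import math


def robust_profile(n, lam, snr, theta):
    """Return (gamma, powers) solving (38) with constraint (25), for 0 <= theta < 1."""
    if not 0.0 <= theta < 1.0:
        raise ValueError("this sketch assumes 0 <= theta < 1")
    # Common ratio p_l / p_{l+1} between consecutive users
    r = ((1.0 + snr) / (1.0 + theta * snr)) ** (1.0 / n)
    gamma = (r - 1.0) / (lam * (1.0 - theta * r))
    # Power of the last user in the detection order
    p_last = gamma * (1.0 + theta * snr) / (1.0 + lam * theta * gamma)
    powers = [p_last * r ** (n - l) for l in range(1, n + 1)]
    return gamma, powers


def power_range_db(powers):
    """Ratio of first to last target power, in dB."""
    return 10.0 * math.log10(powers[0] / powers[-1])
```

With `n = 100`, `lam = 0.1` and `snr = 100`, the range returned by `power_range_db` at `theta = 0` is a little under 20 dB, consistent with the discussion above, and it shrinks as `theta` increases.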
I. Practical Power Control Considerations

In the original description of the problem in Section II, we said that we would restrict our attention to estimators that do not require a priori knowledge of the modulation vector $x$.
However, although SeqOMP does not require knowledge at 
the receiver of the channel phases, the above analysis shows 
that knowledge of the order of the conditional received powers 
is necessary to achieve near-far resistance. Additionally, elim- 
inating the self-noise limit requires that powers are explicitly 
targeted to a certain profile. 

The use of power control for on-off random access com- 
munication requires some justification. On-off random access 
signaling is most likely to be used when the users do not 
already have some ongoing communication. For example, in 
cellular systems, it is used for initial access or requests to 
transmit. If the users were already transmitting, the one bit 
could be embedded in the other communication and on-off 
random access signaling would not be needed. Consequently, 
fast feedback power control would likely not be available for such on-off random access transmissions since the users are not likely to have a continuous transmission to measure the received power.

Thus, in practice, power control is likely achievable only 
by open-loop methods. Open-loop power control is used for 
example in cellular systems where each mobile estimates 
the path loss in the downlink and adjusts its access power 
appropriately in the uplink. Open-loop power control is most 
accurate when the uplink and downlink are time-division 
duplexed (TDD) in the same band. 



V. Numerical Simulation

A. Threshold Settings

The performance of the single-user detection and SeqOMP algorithms depends on the setting of the threshold level $\mu$. In the theoretical analysis of Theorem 1, an ideal threshold is calculated assuming infinite block lengths that guarantees perfect detection of the active user set. However, in simulations with finite block lengths, it is more reasonable to set the threshold based on a desired false alarm probability. A false alarm is the event when the algorithm falsely detects that a user is active when it is not. For the single-user detection algorithm in Section III-C or the SeqOMP algorithm in Section IV-A, the false alarm probability is

$$p_{\mathrm{FA}} = \Pr\left( j \in \hat{I} \mid j \notin I_{\mathrm{true}} \right) = \Pr\left( \rho(j) > \mu \mid j \notin I_{\mathrm{true}} \right),$$

which is the probability that the correlation $\rho(j)$ exceeds the threshold $\mu$ when the user $j$ is not active.

It is shown in the proof of Theorem 1 that, when $j \notin I_{\mathrm{true}}$, $\rho(j)$ follows a Beta$(2, 2(m-1))$ distribution. When $m$ is large, this beta distribution is approximately Rayleigh and the false alarm probability is given by

$$p_{\mathrm{FA}} \approx \exp(-\mu m).$$

Thus, the threshold level $\mu$ can be set to

$$\mu = -\log(p_{\mathrm{FA}})/m$$

for a given desired false alarm probability.

In the simulations below, we will run the algorithms with a fixed, small false alarm probability $p_{\mathrm{FA}}$, and measure the missed detection rate given by

$$p_{\mathrm{MD}} = \Pr\left( j \notin \hat{I} \mid j \in I_{\mathrm{true}} \right).$$

The missed detection rate will be averaged over all $j \in I_{\mathrm{true}}$.

B. Evaluation of Bounds

We first compare the actual performance of the SeqOMP algorithm with the bound in Theorem 1. Fig. 3 plots the simulated missed detection probability for SeqOMP at various SNR levels, activity probabilities $\lambda$, and numbers of measurements $m$. In all simulations, the number of users was fixed to $n = 100$ and the users arrived at equal power (MAR = 1). The false alarm probability $p_{\mathrm{FA}}$ was held fixed. The robust power profile of Section IV-H is used with a leakage fraction $\theta = 0.1$.

The dark line in Fig. 3 represents the number of measurements $m$ for which Theorem 1 would theoretically guarantee reliable detection of the active user set at infinite block lengths. To apply the theorem, we used the SINR $\gamma = \gamma(\theta)$ in (40). At the block lengths considered in this simulation, the missed detection probability at the theoretical sufficient condition is small, typically between 2 and 10%. Thus, even at moderate block lengths, the theoretical bound in Theorem 1 can provide a good estimate for the number of measurements for reliable detection.
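The threshold rule of Section V-A, $\mu = -\log(p_{\mathrm{FA}})/m$, and its inverse $p_{\mathrm{FA}} \approx \exp(-\mu m)$ are simple enough to express as a pair of helper functions. The names below are hypothetical, and the sketch relies on the large-$m$ approximation stated in that section rather than on the exact beta distribution.

```python
import math


def threshold_for_false_alarm(p_fa, m):
    """Threshold mu giving approximately the desired false alarm probability."""
    if not 0.0 < p_fa < 1.0:
        raise ValueError("p_fa must lie strictly between 0 and 1")
    return -math.log(p_fa) / m


def false_alarm_for_threshold(mu, m):
    """Approximate false alarm probability exp(-mu * m), valid for large m."""
    return math.exp(-mu * m)
```

The two functions are inverses of each other for a fixed $m$, so a target false alarm probability can be converted to a threshold and checked by converting back. Note that the threshold decreases as the number of measurements $m$ grows.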



C. SeqOMP vs. Single User Detection 

Fig. 4 shows a more direct comparison of the performance of single-user detection and SeqOMP with power shaping. In the simulation, there are $n = 100$ users, the activity probability is $\lambda = 0.1$, and the total SNR is 20 dB. The number of measurements $m$ was varied, and for each $m$, the missed detection probability was estimated with 1000 Monte Carlo trials.

As expected, single-user detection requires the largest number of measurements. For a missed detection rate of 1%, Fig. 4 shows that single-user detection requires approximately $m \approx 210$ measurements. In this simulation of single-user detection, all users arrived at the same power. Employing SeqOMP, but keeping the power profile of the users constant, decreases the number of measurements somewhat to $m \approx 170$ for a 1% missed detection rate. However, using SeqOMP with power shaping decreases the number of measurements by more than a factor of two to $m \approx 95$. Thus, at least at high SNRs, SeqOMP may provide significant gains over simple single-user detection.

D. OMP with Power Shaping 

As discussed earlier, although SeqOMP can provide gains over single-user detection, its performance is typically worse than OMP, even if SeqOMP is used with power shaping. Our interest in the algorithm is that it is simple to analyze. However, we can in principle use power shaping with the better OMP algorithm as well.

While we do not have any analytical result, the simulation in Fig. 5 shows that power shaping provides some gains with OMP as well. Specifically, when the users are targeted at equal receive power, $m \approx 85$ measurements are needed for a missed detection probability of 1%. This number is slightly lower than that required by SeqOMP, even when SeqOMP uses power shaping. When OMP is used with power shaping, the number of measurements decreases to about $m \approx 65$.

Fig. 3. SeqOMP with power shaping: Each colored bar represents the SeqOMP algorithm's missed detection probability as a function of the number of measurements $m$, with different bars showing different activity probabilities $\lambda$ and SNR levels. The missed detection probabilities were estimated with 1000 Monte Carlo trials. The number of users is set to $n = 100$ and the false alarm probability $p_{\mathrm{FA}}$ is held fixed. The power shaping is performed with a leakage fraction of $\theta = 0.1$. The dark black line shows the theoretical number of measurements $m$ required in Theorem 1 with $\gamma = \gamma(\theta)$ in (40).

VI. Relations to MAC Capacity 

As discussed in the introduction, the random access channel 
is a special case of a multiple access channel (MAC). One of 
the fundamental results in network information theory [4], [5] 
is that, under certain assumptions, the sum rate with multiple 
users transmitting to a single receiver without coordination 
can equal the capacity with coordination. However, one of the 
key assumptions in this classic result is that the users employ 
capacity-achieving block codes. In the on-off random access 
channel considered here, users transmit on a single codeword 
and therefore cannot benefit from channel coding. Thus, unlike 
the classic MAC channel, the random access channel may 
incur a loss in capacity due to the lack of coordination amongst 
users. 

To evaluate this possibility, let us first compute the effective 
"sum rate" transmitted in the on-off random access channel. 



Each user transmits with a probability $\lambda$, so the information conveyed in detecting the user's activity is $h(\lambda)$, where $h(\lambda)$ is the binary entropy,

$$h(\lambda) = -\lambda\log(\lambda) - (1 - \lambda)\log(1 - \lambda) \text{ nats}.$$

Since there are $n$ users, if all users can be reliably detected, the total information rate is

$$R = n h(\lambda).$$
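As a quick numerical illustration of the effective sum rate, the binary entropy in nats and $R = n h(\lambda)$ can be computed as follows. The function names are hypothetical, and the convention $h(0) = h(1) = 0$ is the usual limiting value.

```python
import math


def binary_entropy_nats(lam):
    """Binary entropy h(lam) in nats, with h(0) = h(1) = 0 by convention."""
    if lam <= 0.0 or lam >= 1.0:
        return 0.0
    return -lam * math.log(lam) - (1.0 - lam) * math.log(1.0 - lam)


def sum_rate_nats(n, lam):
    """Effective sum rate R = n * h(lam) of the on-off random access channel."""
    return n * binary_entropy_nats(lam)
```

For example, with $n = 100$ users and $\lambda = 0.1$, each user's activity conveys about 0.325 nats, so $R$ is about 32.5 nats.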

We can compare this rate with the Shannon capacity of the channel. If all the users coordinate their transmissions, the capacity would be identical to a single user transmitting with the same total power. Since the channel is AWGN with $m$ channel uses, the capacity of the channel with a single coordinated transmission would be $C = m\log(1 + \mathrm{SNR})$. If the number of measurements $m$ is selected for reliable detection, the necessary condition (12) shows that the capacity is bounded below by

$$C = m\log(1 + \mathrm{SNR}) > \frac{\log(1 + \mathrm{SNR})}{\mathrm{SNR}}\,\lambda n \log((1 - \lambda)n).$$

Thus, dividing $R = n h(\lambda)$ by this lower bound, the ratio of the sum rate to capacity is bounded above by

$$\frac{R}{C} < \frac{\mathrm{SNR}\; h(\lambda)}{\lambda \log(1 + \mathrm{SNR}) \log((1 - \lambda)n)}.$$

This ratio represents a bound on the maximum rate without coordination amongst the users relative to the maximum rate possible with coordination. If $\lambda$ and the SNR are fixed and $n \to \infty$, the ratio $R/C \to 0$. Thus, the sum rate of the random access channel has a fundamentally lower scaling than the standard AWGN channel.

Fig. 4. Missed detection probabilities for various detection methods and power profiles, plotted against the number of measurements $m$. The number of users is $n = 100$, SNR = 20 dB, the activity probability is $\lambda = 0.1$, and the false alarm rate $p_{\mathrm{FA}}$ is held fixed. For the SeqOMP algorithm with power shaping, the leakage fraction was set to $\theta = 0.1$.

Fig. 5. Power shaping with OMP. Plotted are the missed detection probabilities, against the number of measurements $m$, with OMP using a constant power profile and with power shaping with a leakage fraction set to $\theta = 0.1$. Other simulation assumptions are identical to Fig. 4.

There is, however, one case where the random access channel's sum rate achieves the single-user Shannon capacity. Suppose that the SeqOMP algorithm is used with exponential power shaping. The sufficient condition (34) shows that the number of measurements $m$ can be selected such that the Shannon capacity is

$$C = m\log(1 + \mathrm{SNR}) \approx \lambda n L(n, \lambda),$$

where, in the approximation, we have ignored the infinitesimal $\delta$, and the $\lambda n$ term. In this case, the ratio of the sum rate to capacity is

$$\frac{R}{C} = \frac{h(\lambda)}{\lambda L(n, \lambda)}.$$

Now suppose the expected number of active users is fixed to some value $k$, and we let the activity probability scale as $\lambda(n) = k/n$. It is easily checked that $R/C \to 1$ as $n \to \infty$. Therefore, with a fixed expected number of active users, the sum rate of the random access channel matches the Shannon capacity as the number of users $n \to \infty$. Moreover, the random access capacity can be achieved with the SeqOMP method with exponential power shaping.

In a way, this result is perhaps not surprising. When the expected number of users is fixed to some value $k$, and the block length $m$ scales to infinity, the random access channel becomes identical to a standard MAC channel with $k$ users, each transmitting on a random codebook of size $n/k$.
Moreover, the SeqOMP algorithm is precisely equivalent to 
the classic SIC used in conjunction with ML detection for 
each user. SIC combined with optimal decoding for each user 
is known to achieve the sum rate. 

The connection between the MAC channel and sparsity detection has also been observed by Jin and Rao [29]. Specifically, they show that OMP is clearly an analogue to the classic SIC method. Moreover, they argue, at least heuristically, that if $\lambda \ll 1$, the sum-rate $R$ achievable by OMP should approach the capacity $C$.

Our analysis of SeqOMP provides analytic evidence for these claims by showing a specific regime where $R/C \to 1$. However, it also shows when this intuition fails by showing that when the SNR and activity probability $\lambda$ are fixed, then $R/C \to 0$. In this case, there is a potentially-large gap between the MAC capacity and the sum rate in the random on-off channel.

VII. Conclusions 

Sparse signal detection is a valuable framework for under- 
standing multiple access on-off random signaling. Results can 
provide simple capacity estimates and clarify the role of power 
control and multiuser detection. Methods such as OMP and 
lasso, which are widely used in sparse detection problems, can 
be applied as multiuser detection methods for on-off random 
access channels. Analysis shows that these methods may offer 
improved near-far resistance over single-user detection in high 
SNRs. Optimal ML detection may theoretically offer further 
gains in the high SNR regime, but is not computationally 
possible. However, some gains at high SNR may be practically 
achievable through power shaping and SIC-like techniques 
such as OMP. 

Appendix 
Proof of Theorem 1

A. Proof Outline 

At a high level, the proof of Theorem 1 is similar to the proof of [15, Thm. 2], the single-user detection condition of Section III-C. One of the difficulties in the proof is to handle the relationships between random events at different iterations of the SeqOMP algorithm. To avoid this difficulty, we first show an equivalence between the success of SeqOMP and an alternative sequence of events that is easier to analyze. After this simplification, small modifications handle the cancellations of detected vectors.

Fix $n$ and define

$$I_{\mathrm{true}}(j) = \{ \ell : \ell \in I_{\mathrm{true}},\ \ell \le j \},$$

which is the set of elements of the active set with indices $\ell \le j$. Observe that $I_{\mathrm{true}}(0) = \emptyset$ and $I_{\mathrm{true}}(n) = I_{\mathrm{true}}$. Let $P_{\mathrm{true}}(j)$ be the projection operator onto the orthogonal complement of $\{a_\ell,\ \ell \in I_{\mathrm{true}}(j-1)\}$, and define

$$\rho_{\mathrm{true}}(j) = \frac{|a_j' P_{\mathrm{true}}(j) y|^2}{\|P_{\mathrm{true}}(j) a_j\|^2 \, \|P_{\mathrm{true}}(j) y\|^2}. \tag{41}$$



A simple induction argument shows that Algorithm 1 correctly detects the elements in the active set if and only if, at each iteration $j$, the variables $\hat{I}(j)$, $P(j)$ and $\rho(j)$ defined in the algorithm are equal to $I_{\mathrm{true}}(j)$, $P_{\mathrm{true}}(j)$ and $\rho_{\mathrm{true}}(j)$, respectively. Therefore, if we define

$$\hat{I} = \{ j : \rho_{\mathrm{true}}(j) > \mu \}, \tag{42}$$

then Algorithm 1 correctly detects all users if and only if $\hat{I} = I_{\mathrm{true}}$. In particular,

$$P_{\mathrm{err}}(n) = \Pr\left( \hat{I} \ne I_{\mathrm{true}} \right).$$

To prove that $P_{\mathrm{err}}(n) \to 0$, it suffices to show that there exists a sequence of threshold levels $\mu(n)$ such that the following two limits

$$\liminf_{n \to \infty} \min_{j \in I_{\mathrm{true}}(n)} \frac{\rho_{\mathrm{true}}(j)}{\mu} > 1, \tag{43}$$

$$\limsup_{n \to \infty} \max_{j \notin I_{\mathrm{true}}(n)} \frac{\rho_{\mathrm{true}}(j)}{\mu} < 1, \tag{44}$$

hold in probability. The first limit (43) ensures that all the components in the active set will not be missed and will be called the zero missed detection condition. The second limit (44) ensures that all the components not in the active set will not be falsely detected and will be called the zero false alarm condition.

Set the sequence of threshold levels as follows. Since $\delta > 0$, we can find an $\epsilon > 0$ such that

$$(1 + \delta) \ge (1 + \epsilon)^2. \tag{45}$$

For each $n$, let the threshold level be

$$\mu = (1 + \epsilon)\,\frac{\log(n(1 - \lambda))}{m - \lambda n}. \tag{46}$$

The asymptotic lack of missed detections and false alarms with these thresholds is proven in Appendices D and E, respectively. In preparation for these sections, Appendix B reviews some facts concerning tail bounds on chi-squared and beta random variables and Appendix C performs some preliminary computations.



B. Chi-Squared and Beta Random Variables 

The proof requires a number of simple facts concerning 
chi-squared and beta random variables. These variables are 
reviewed in [37]. We will omit or just provide some sketches 
of the proofs of the results in this section since they are all 
standard. 

A random variable $u$ has a chi-squared distribution with $r$ degrees of freedom if it can be written as $u = \sum_{i=1}^{r} z_i^2$, where $z_i$ are i.i.d. $\mathcal{N}(0, 1)$. If $u$ is a chi-squared with two degrees of freedom, the random variable $v = u/2$ has a Rayleigh distribution. For this work, chi-squared and Rayleigh distributed random variables arise in two important instances.

Lemma 1: Suppose $x \in \mathbb{C}^r$ has a complex Gaussian distribution $\mathcal{CN}(0, \sigma^2 I_r)$. Then:

(a) $2\|x\|^2/\sigma^2$ is chi-squared with $2r$ degrees of freedom; and

(b) if $y$ is any other $r$-dimensional random vector that is nonzero with probability one and independent of $x$, then the variable

$$\frac{|x'y|^2}{\sigma^2 \|y\|^2}$$

has a Rayleigh distribution.

Proof: Part (a) follows from the fact that the norm $2\|x\|^2/\sigma^2$ is a sum of squares of $2r$ unit-variance Gaussian random variables, one for each component of $\sqrt{2}\,x/\sigma$. Part (b) follows from the fact that $x'y/(\|y\|\sigma)$ is a unit-variance complex Gaussian random variable. ■

The following two lemmas provide standard tail bounds. 

Lemma 2: Suppose that for each $n$, $\{x_j^{(n)}\}_{j=1}^{n}$ is a set of complex Gaussian random vectors with each $x_j^{(n)}$ spherically symmetric in an $m_j(n)$-dimensional space. The variables may be dependent. Suppose also that $E\|x_j^{(n)}\|^2 = 1$ and

$$\lim_{n \to \infty} \log(n)/m_{\min}(n) = 0,$$

where

$$m_{\min}(n) = \min_{j = 1, \ldots, n} m_j(n).$$

Then the limits

$$\lim_{n \to \infty} \max_{j = 1, \ldots, n} \|x_j^{(n)}\|^2 = \lim_{n \to \infty} \min_{j = 1, \ldots, n} \|x_j^{(n)}\|^2 = 1$$

hold in probability.

Proof: From Lemma [1] for every j and n, the norms 

zij,n) = 2TOj(n)||xj"^||^, 

are chi-squared random variables with 2mj{n) degrees of 
freedom. A standard tail bound (see, for example [14]), shows 
that for any e > 0, 

Pr ( ^^'''^\ > 1 + e ) < cxp(-2emj (n)) 



2171 j (n) 



< exp(-2emmin(n)) 



where the last step is due to the fact that mj{n) > mjnin{n). 
So, using the union bound,



Pr ( max ||x^"^||^ > 1 + e 

yj = l,...,n 

Pr ( max , \ > 1 + e 



j=i,...,n 2mj{n) 



> 1 



< n max Pr , _ 

j=i,...,n \2mj(n) 

< nexp(— 2em,nin('T-)) 

= exp(-2em,nin(") + log(n)) 0, 

where the last step is due to the fact that log(n) /mmin(n) 0. 
This shows that 

limsup max ||xj"^|p < 1 

in probabiUty. 

Similarly, using the tail bound that

$$\Pr\left( \frac{z_j^{(n)}}{2 m_j(n)} < 1 - \epsilon \right) \le \exp(-\epsilon m_j(n)),$$

one can show that

$$\liminf_{n \to \infty} \min_{j=1,\ldots,n} \|x_j^{(n)}\|^2 \ge 1$$

in probability, and this proves the lemma. ■
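The concentration described by Lemma 2 can be illustrated with a short simulation. This is a sketch under stated assumptions: the vectors are drawn independently (the lemma also allows dependence), and the squared norms are sampled directly from a gamma distribution, since Lemma 1(a) implies that $\|x_j\|^2$ with $E\|x_j\|^2 = 1$ in $m_j$ complex dimensions is Gamma distributed with shape $m_j$ and scale $1/m_j$. The function name and the dimensions used are hypothetical.

```python
import random


def extreme_norms(dims, seed=0):
    """Return (max_j ||x_j||^2, min_j ||x_j||^2) for one random draw.

    dims[j] is the complex dimension m_j of the j-th vector; each squared
    norm is sampled as Gamma(shape=m_j, scale=1/m_j), which has mean 1.
    """
    rng = random.Random(seed)
    norms = [rng.gammavariate(m_j, 1.0 / m_j) for m_j in dims]
    return max(norms), min(norms)


# Example: n = 200 vectors, each in m_j = 2000 dimensions, so that
# log(n)/m_min is small and both extremes should sit near 1.
largest, smallest = extreme_norms([2000] * 200)
```

Increasing the dimensions while keeping $\log(n)/m_{\min}(n)$ small should pull both extremes closer to 1.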
Lemma 3: Suppose that for each $n$, $\{u_j^{(n)}\}_{j=1}^{n}$ is a set of
Rayleigh random variables. The variables may be dependent.
Then

$$\limsup_{n \to \infty} \max_{j=1,\ldots,n} \frac{u_j^{(n)}}{\log(n)} \le 1, \qquad (47)$$

where the limit is in probability.

Proof: Since each $u_j^{(n)}$ is Rayleigh, for any $\mu > 0$,

$$\Pr(u_j^{(n)} > \mu) = e^{-\mu}.$$

Combining this with the union bound, we see that for any
$\epsilon > 0$,

$$\Pr\left( \max_{j=1,\ldots,n} \frac{u_j^{(n)}}{\log(n)} > 1 + \epsilon \right) \le n \exp(-(1 + \epsilon)\log(n)) = n^{-\epsilon} \to 0.$$

This proves the limit (47). ■
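The union bound in the proof of Lemma 3 is easy to evaluate. The sketch below computes the bound $n^{-\epsilon}$ and compares it with an empirical exceedance frequency. As an assumption made only for this illustration, the variables are drawn independently with tail $e^{-\mu}$; the function names are hypothetical.

```python
import math
import random


def union_bound(n, eps):
    """Union bound n * exp(-(1 + eps) * log(n)), which equals n ** (-eps)."""
    return n * math.exp(-(1.0 + eps) * math.log(n))


def empirical_exceedance(n, eps, trials=2000, seed=1):
    """Fraction of trials in which max_j u_j / log(n) exceeds 1 + eps."""
    rng = random.Random(seed)
    threshold = (1.0 + eps) * math.log(n)
    hits = 0
    for _ in range(trials):
        # Each u_j has Pr(u_j > mu) = exp(-mu), matching the proof above.
        if max(rng.expovariate(1.0) for _ in range(n)) > threshold:
            hits += 1
    return hits / trials
```

For example, `union_bound(100, 0.5)` equals $100^{-1/2} = 0.1$, and the empirical frequency for the same arguments should not exceed it by more than Monte Carlo error.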
The final two lemmas concern certain beta distributed
random variables. A real-valued scalar random variable $w$
follows a Beta$(r, s)$ distribution if it can be written as
$w = u_r/(u_r + v_s)$, where the variables $u_r$ and $v_s$ are independent
chi-squared random variables with $r$ and $s$ degrees of freedom,
respectively. The importance of the beta distribution is given
by the following lemma.

Lemma 4: Suppose $x$ and $y$ are independent random
$r$-dimensional complex random vectors with $x$ being
spherically-symmetrically distributed in $\mathbb{C}^r$ and $y$ having any distribution
that is nonzero with probability one. Then the random variable

$$\frac{|x'y|^2}{\|x\|^2 \|y\|^2}$$

is independent of $x$ and follows a Beta$(2, r - 2)$ distribution.



Proof: This can be proven along the lines of the 
arguments in [38]. ■ 

The following lemma provides a simple expression for the 
maxima of certain beta distributed variables. 

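Lemma 5: For each $n$, suppose $\{w_j^{(n)}\}_{j=1}^{n}$ is a set of
random variables with $w_j^{(n)}$ having a Beta$(2, m_j(n) - 2)$
distribution. Suppose that

$$\lim_{n \to \infty} \log(n)/m_{\min}(n) = 0, \qquad \lim_{n \to \infty} m_{\min}(n) = \infty, \qquad (48)$$

where

$$m_{\min}(n) = \min_{j=1,\ldots,n} m_j(n).$$

Then,

$$\limsup_{n \to \infty} \max_{j=1,\ldots,n} \frac{m_j(n)}{\log(n)} w_j^{(n)} \le 1$$

in probability.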

Proof: We can write $w_j^{(n)} = u_j^{(n)}/(u_j^{(n)} + v_j^{(n)})$ where
$u_j^{(n)}$ and $v_j^{(n)}$ are independent chi-squared random variables
with 2 and $m_j(n) - 2$ degrees of freedom, respectively. Let

$$U_n = \frac{1}{2\log(n)} \max_{j=1,\ldots,n} u_j^{(n)},$$
$$V_n = \min_{j=1,\ldots,n} \frac{v_j^{(n)}}{2m_j(n) - 2},$$
$$T_n = \max_{j=1,\ldots,n} \frac{m_j(n) w_j^{(n)}}{\log(n)}.$$

The condition (48) and an argument similar to the proof of
Lemma 2 show that $V_n \to 1$ in probability. Also, each $u_j^{(n)}/2$ is
Rayleigh distributed, so Lemma 3 shows that

$$\limsup_{n \to \infty} U_n \le 1$$

in probability. Using these two limits along with (48) shows
that

$$\limsup_{n \to \infty} T_n = \limsup_{n \to \infty} \max_{j=1,\ldots,n} \frac{m_j(n) w_j^{(n)}}{\log(n)}$$
$$= \limsup_{n \to \infty} \max_{j=1,\ldots,n} \frac{m_j(n)}{\log(n)} \frac{u_j^{(n)}}{u_j^{(n)} + v_j^{(n)}}$$
$$\le \limsup_{n \to \infty} \frac{U_n}{V_n + U_n \log(n)/m_j(n)}$$
$$\le \frac{1}{1 + (1)(0)} = 1,$$

where the limit is in probability. ■

C. Preliminary Computations and Technical Lemmas 

We first need to prove a number of simple but technical
bounds. We begin by considering the dimension $m_i$ defined
as

$$m_i = \dim(\mathrm{range}(P_{\rm true}(i))). \qquad (49)$$

Our first lemma computes the limit of this dimension. 
Lemma 6: The following limit

$$\lim_{n \to \infty} \min_{i=1,\ldots,n} \frac{m_i}{m - \lambda n} = 1 \qquad (50)$$

holds in probability and almost surely. The deterministic limits

$$\lim_{n \to \infty} \frac{\log(\lambda n)}{m - \lambda n} = \lim_{n \to \infty} \frac{\log((1 - \lambda)n)}{m - \lambda n} = 0 \qquad (51)$$

also hold.

Proof: Recall that $P_{\rm true}(i)$ is the projection onto the
orthogonal complement of the vectors $a_j$ with $j \in I_{\rm true}(i - 1)$.
With probability one, these vectors will be linearly independent,
so $P_{\rm true}(i)$ will have dimension $m - |I_{\rm true}(i - 1)|$. Since
$I_{\rm true}(i)$ is increasing with $i$,

$$\min_{i=1,\ldots,n} m_i = m - \max_{i=1,\ldots,n} |I_{\rm true}(i - 1)| = m - |I_{\rm true}(n - 1)|. \qquad (52)$$

Since each user is active with probability $\lambda$ and the activities
of the users are independent, the law of large numbers shows
that

$$\lim_{n \to \infty} \frac{|I_{\rm true}(n - 1)|}{\lambda(n - 1)} = 1$$

in probability and almost surely. Combining this with (52)
shows (50).

We next show (51). Since the hypothesis of the theorem
requires that $\lambda n$, $(1 - \lambda)n$ and $m - \lambda n$ all approach infinity,
the fractions in (51) are eventually positive. Also, from the
definition of $L(\lambda, n)$,
$L(\lambda, n) \ge \max\{\log(\lambda n), \log((1 - \lambda)n)\}$. Therefore,

$$\frac{1}{m - \lambda n} \max\{\log(\lambda n), \log((1 - \lambda)n)\} \le \frac{\gamma}{L(\lambda, n)} \max\{\log(\lambda n), \log((1 - \lambda)n)\} \le \gamma \to 0,$$

where the last step is from the hypothesis of the theorem. ■
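The limit (50) can be illustrated by simulating the activity pattern. The sketch below assumes, as in the proof, that the active users' vectors are linearly independent, so that $m_i = m - |I_{\rm true}(i-1)|$, and it takes $I_{\rm true}(i-1)$ to be the set of active users with index below $i$. The function name and the parameter values in the example are illustrative assumptions.

```python
import random


def min_dimension_ratio(n, m, lam, seed=0):
    """Return min_i m_i / (m - lam * n) for one random activity pattern.

    Each of the n users is active independently with probability lam, and
    m_i = m - (number of active users among indices 1, ..., i - 1).
    """
    rng = random.Random(seed)
    active = [rng.random() < lam for _ in range(n)]
    active_so_far = 0  # plays the role of |I_true(i - 1)|
    min_m_i = m
    for is_active in active:
        min_m_i = min(min_m_i, m - active_so_far)
        if is_active:
            active_so_far += 1
    return min_m_i / (m - lam * n)


# Example: n = 20000 users, m = 6000 degrees of freedom, lam = 0.1.
ratio = min_dimension_ratio(20000, 6000, 0.1)
```

Because $m_i$ is non-increasing in $i$, the minimum is attained at $i = n$, which matches (52); the returned ratio should be close to 1 when $\lambda n$ and $m - \lambda n$ are both large.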
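Next, for each $i = 1, \ldots, n$, define the residual vector,

$$e_i = P_{\rm true}(i)(y - a_i x_i). \qquad (53)$$

Observe that

$$e_i = P_{\rm true}(i)(y - a_i x_i)$$
$$\overset{(a)}{=} P_{\rm true}(i)\Big( w + \sum_{j \ne i} a_j x_j \Big)$$
$$\overset{(b)}{=} P_{\rm true}(i)\Big( w + \sum_{j > i} a_j x_j \Big), \qquad (54)$$

where (a) follows from (1) and (b) follows from the fact that
$P_{\rm true}(i)$ is the projection onto the orthogonal complement of
the span of all vectors $a_j$ with $j < i$ and $x_j \ne 0$.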

The next lemma shows that the power of the residual vector
is described by the random variable

$$\sigma^2(i) = 1 + \sum_{j=i+1}^{n} |x_j|^2. \qquad (55)$$

Lemma 7: For all $i = 1, \ldots, n$, the residual vector $e_i$,
conditioned on the modulation vector $x$ and projection $P_{\rm true}(i)$,
is a spherically symmetric Gaussian in the range space of
$P_{\rm true}(i)$ with total variance

$$E\left( \|e_i\|^2 \mid x \right) = \frac{m_i}{m} \sigma^2(i), \qquad (56)$$

where $m_i$ and $\sigma^2(i)$ are defined in (49) and (55), respectively.
Proof: Let

$$v_i = w + \sum_{j > i} a_j x_j,$$

so that $e_i = P_{\rm true}(i) v_i$. Since the vectors $a_j$ and $w$ have
Gaussian $\mathcal{CN}(0, (1/m) I_m)$ distributions, for a given modulation
vector $x$, $v_i$ must be a zero-mean white Gaussian vector
with total variance $E\|v_i\|^2 = \sigma^2(i)$. Also, since the operator
$P_{\rm true}(i)$ is a function of the components $x_\ell$ and vectors $a_\ell$ for
$\ell < i$, $P_{\rm true}(i)$ is independent of the vectors $w$ and $a_j$, $j > i$,
and therefore independent of $v_i$. Since $P_{\rm true}(i)$ is a projection
from an $m$-dimensional space to an $m_i$-dimensional space, $e_i$,
conditioned on the modulation vector $x$, must be spherically
symmetric Gaussian in the range space of $P_{\rm true}(i)$ with total
variance satisfying (56). ■

Our next lemma requires the following version of the well- 
known Hoeffding's inequality. 

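Lemma 8 (Hoeffding's Inequality): Suppose $z$ is the sum

$$z = z_0 + \sum_{i} z_i,$$

where $z_0$ is a constant and the variables $z_i$ are independent
random variables that are almost surely bounded in some
interval $z_i \in [a_i, b_i]$. Then, for all $\epsilon > 0$,

$$\Pr(z - E(z) \ge \epsilon) \le \exp\left( \frac{-2\epsilon^2}{C} \right),$$

where

$$C = \sum_{i} (b_i - a_i)^2.$$

Proof: See [39]. ■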
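The bound in Lemma 8 depends on the summands only through the widths of their bounding intervals. A minimal sketch of how it is evaluated follows; the function name and the example intervals are illustrative assumptions.

```python
import math


def hoeffding_bound(eps, intervals):
    """Upper bound exp(-2 eps^2 / C) on Pr(z - E(z) >= eps).

    intervals is a list of (a_i, b_i) pairs bounding the independent
    summands z_i; C is the sum of squared interval widths.
    """
    c = sum((b - a) ** 2 for a, b in intervals)
    return math.exp(-2.0 * eps ** 2 / c)


# Example: four summands, each bounded in [0, 1], deviation eps = 0.5.
bound = hoeffding_bound(0.5, [(0.0, 1.0)] * 4)
```

Here $C = 4$, so the bound evaluates to $\exp(-2 \cdot 0.25 / 4) = \exp(-0.125)$. Narrower intervals shrink $C$ and tighten the bound, which is the mechanism used in the proof of Lemma 9.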
Lemma 9: Under the assumptions of Theorem 1, the limit

$$\limsup_{n \to \infty} \max_{i=1,\ldots,n} \frac{\sigma^2(i)}{\hat{\sigma}^2(i)} \le 1$$

holds in probability.

Proof: Let $z(i) = \sigma^2(i)/\hat{\sigma}^2(i)$. From the definition of
$\sigma^2(i)$ in (55), we can write

$$z(i) = \frac{1}{\hat{\sigma}^2(i)} + \sum_{j=i+1}^{n} z(i,j),$$

where $z(i,j) = |x_j|^2/\hat{\sigma}^2(i)$ for $j > i$.

Now recall that in the problem formulation, each user is
active with probability $\lambda$, with power $|x_j|^2 = p_j$ conditioned
on the user being active. Also, the activities of different
users are independent, and the conditional powers $p_j$ are
treated as deterministic quantities. Therefore, the variables
$z(i,j)$ are independent with

$$z(i,j) = \begin{cases} p_j/\hat{\sigma}^2(i), & \text{with probability } \lambda; \\ 0, & \text{with probability } 1 - \lambda, \end{cases}$$

for $j > i$. Combining this with the definition of $\hat{\sigma}^2(i)$ in (27),
we see that

$$E(z(i)) = \frac{1}{\hat{\sigma}^2(i)} \Big( 1 + \lambda \sum_{j=i+1}^{n} p_j \Big) = 1.$$

Also, for each $j > i$, we have the bound

$$0 \le z(i,j) \le p_j/\hat{\sigma}^2(i).$$

So for use in Hoeffding's Inequality (Lemma 8), define

$$C = C(i,n) = \hat{\sigma}^{-4}(i) \sum_{j=i+1}^{n} p_j^2,$$

where the dependence of the power profile and $\hat{\sigma}(i)$ on $n$ is
implicit. Now define

$$c_n = \max_{i=1,\ldots,n} \log(n) C(i,n),$$

so that $C(i,n) \le c_n/\log(n)$ for all $i$. Hoeffding's Inequality
(Lemma 8) now shows that for all $i < n$,

$$\Pr(z(i) \ge 1 + \epsilon) \le \exp\left( -2\epsilon^2/C(i,n) \right) \le \exp\left( -2\epsilon^2 \log(n)/c_n \right).$$

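Using the union bound,

$$\lim_{n \to \infty} \Pr\left( \max_{i=1,\ldots,n} z(i) > 1 + \epsilon \right) \le \lim_{n \to \infty} n \exp\left( -\frac{2\epsilon^2}{c_n} \log(n) \right) = \lim_{n \to \infty} n^{1 - 2\epsilon^2/c_n} = 0.$$

The final step is due to the fact that the technical condition
in the theorem implies $c_n \to 0$. This proves the lemma. ■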



D. Missed Detection Probability 

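Consider any $j \in I_{\rm true}$. Using (53) to rewrite (41) along
with some algebra shows

$$\rho_{\rm true}(j) = \frac{|a_j' P_{\rm true}(j) y|^2}{\|P_{\rm true}(j) a_j\|^2 \|P_{\rm true}(j) y\|^2}$$
$$= \frac{|a_j'(x_j P_{\rm true}(j) a_j + e_j)|^2}{\|P_{\rm true}(j) a_j\|^2 \|x_j P_{\rm true}(j) a_j + e_j\|^2}$$
$$\ge \frac{s_j - 2\sqrt{z_j s_j} + z_j}{s_j + 2\sqrt{z_j s_j} + 1}, \qquad (57)$$

where

$$s_j = \frac{|x_j|^2 \|P_{\rm true}(j) a_j\|^2}{\|e_j\|^2}, \qquad (58)$$
$$z_j = \frac{|a_j' P_{\rm true}(j) e_j|^2}{\|P_{\rm true}(j) a_j\|^2 \|e_j\|^2}. \qquad (59)$$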
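The "some algebra" behind the lower bound on $\rho_{\rm true}(j)$ is a pair of triangle inequalities, and the bound can be spot-checked numerically. The sketch below makes a simplifying assumption not in the text: the projection is taken to be the identity, so $a$ and $e$ are arbitrary complex vectors and $y = x a + e$. The function names are hypothetical.

```python
import math
import random


def inner(u, v):
    """Inner product u'v with conjugation on the first argument."""
    return sum(uc.conjugate() * vc for uc, vc in zip(u, v))


def norm2(u):
    """Squared Euclidean norm of a complex vector."""
    return sum(abs(c) ** 2 for c in u)


def rho_and_bound(x, a, e):
    """Return (rho, lower bound) for y = x * a + e, with projection = identity."""
    y = [x * ac + ec for ac, ec in zip(a, e)]
    rho = abs(inner(a, y)) ** 2 / (norm2(a) * norm2(y))
    s = abs(x) ** 2 * norm2(a) / norm2(e)                # analogue of s_j
    z = abs(inner(a, e)) ** 2 / (norm2(a) * norm2(e))    # analogue of z_j
    root = math.sqrt(z * s)
    bound = (s - 2.0 * root + z) / (s + 2.0 * root + 1.0)
    return rho, bound


def random_vector(r, rng):
    """Complex vector with independent standard normal real and imaginary parts."""
    return [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(r)]
```

The numerator of the bound comes from $|A + B|^2 \ge (|A| - |B|)^2$ applied to $a'y = x\|a\|^2 + a'e$, and the denominator from $\|x a + e\|^2 \le |x|^2\|a\|^2 + 2|x||a'e| + \|e\|^2$; calling `rho_and_bound` on random inputs should always return `rho >= bound`.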



Define

$$s_{\min} = \min_{j \in I_{\rm true}} s_j, \qquad z_{\max} = \max_{j \in I_{\rm true}} z_j.$$

We will now bound $s_{\min}$ from below and $z_{\max}$ from above.

We first start with $s_{\min}$. Conditional on $x$ and $P_{\rm true}(j)$,
Lemma 7 shows that each $e_j$ is a spherically-symmetrically
distributed Gaussian on the $m_j$-dimensional range space of
$P_{\rm true}(j)$. Since there are asymptotically $\lambda n$ elements in $I_{\rm true}$,
Lemma 2 along with (51) show that

$$\lim_{n \to \infty} \max_{j \in I_{\rm true}} \frac{m}{m_j \sigma^2(j)} \|e_j\|^2 = 1, \qquad (60)$$

where the limit is in probability. Similarly, $P_{\rm true}(j) a_j$ is
also a spherically-symmetrically distributed Gaussian in the
range space of $P_{\rm true}(j)$. Since $P_{\rm true}(j)$ is a projection
from an $m$-dimensional space to an $m_j$-dimensional space
and $E\|a_j\|^2 = 1$, we have that $E\|P_{\rm true}(j) a_j\|^2 = m_j/m$.
Therefore, Lemma 2 along with (51) show that

$$\lim_{n \to \infty} \min_{j \in I_{\rm true}} \frac{m}{m_j} \|P_{\rm true}(j) a_j\|^2 = 1. \qquad (61)$$

Taking the limit (in probability) of $s_{\min}$,



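$$\liminf_{n \to \infty} \frac{1}{\gamma} s_{\min} = \liminf_{n \to \infty} \min_{j \in I_{\rm true}} \frac{1}{\gamma} s_j$$
$$\overset{(a)}{=} \liminf_{n \to \infty} \min_{j \in I_{\rm true}} \frac{|x_j|^2 \|P_{\rm true}(j) a_j\|^2}{\gamma \|e_j\|^2}$$
$$\overset{(b)}{=} \liminf_{n \to \infty} \min_{j \in I_{\rm true}} \frac{p_j}{\gamma \sigma^2(j)}$$
$$\overset{(c)}{=} \liminf_{n \to \infty} \min_{j \in I_{\rm true}} \frac{\gamma_j \hat{\sigma}^2(j)}{\gamma \sigma^2(j)}$$
$$\overset{(d)}{\ge} \liminf_{n \to \infty} \min_{j \in I_{\rm true}} \frac{\gamma_j}{\gamma}$$
$$\overset{(e)}{\ge} 1, \qquad (62)$$

where (a) follows from (58); (b) follows from (60) and (61);
(c) follows from (24); (d) follows from Lemma 9; and (e)
follows from (26).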

We next consider $z_{\max}$. Conditional on $P_{\rm true}(j)$, the vectors
$P_{\rm true}(j) a_j$ and $e_j$ are independent spherically-symmetric
complex Gaussians in the range space of $P_{\rm true}(j)$. It follows
from Lemma 4 that each $z_j$ is a Beta$(2, m_j - 2)$ random
variable. Since there are asymptotically $\lambda n$ elements in $I_{\rm true}$,
Lemma 5 along with (50) and (51) show that

$$\limsup_{n \to \infty} \frac{m - \lambda n}{\log(\lambda n)} z_{\max} = \limsup_{n \to \infty} \frac{m - \lambda n}{\log(\lambda n)} \max_{j \in I_{\rm true}} z_j \le 1. \qquad (63)$$

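The above analysis shows that for any $j \in I_{\rm true}$,

$$\liminf_{n \to \infty} \min_{j \in I_{\rm true}} \frac{1}{\sqrt{\mu}} \left( \sqrt{s_j} - \sqrt{z_j} \right)$$
$$\overset{(a)}{\ge} \liminf_{n \to \infty} \frac{1}{\sqrt{\mu}} \left( \sqrt{s_{\min}} - \sqrt{z_{\max}} \right)$$
$$\overset{(b)}{\ge} \liminf_{n \to \infty} \frac{1}{\sqrt{\mu}} \left( \sqrt{\gamma} - \sqrt{\frac{\log(\lambda n)}{m - \lambda n}} \right)$$
$$\overset{(c)}{\ge} \liminf_{n \to \infty} \frac{1}{\sqrt{\mu}} \left( \sqrt{\frac{(1 + \delta) L(\lambda,n)}{m - \lambda n}} - \sqrt{\frac{\log(\lambda n)}{m - \lambda n}} \right)$$
$$\ge \liminf_{n \to \infty} \sqrt{\frac{1 + \delta}{(m - \lambda n)\mu}} \left( \sqrt{L(\lambda,n)} - \sqrt{\log(\lambda n)} \right)$$
$$\overset{(d)}{=} \liminf_{n \to \infty} \sqrt{\frac{(1 + \delta)\log(n(1 - \lambda))}{(m - \lambda n)\mu}}$$
$$\overset{(e)}{=} \liminf_{n \to \infty} \sqrt{\frac{1 + \delta}{1 + \epsilon}}$$
$$\overset{(f)}{\ge} \sqrt{1 + \epsilon}, \qquad (64)$$

where (a) follows from the definitions of $s_{\min}$ and $z_{\max}$; (b)
follows from (62) and (63); (c) follows from (29); (d) follows
from (16); (e) follows from (46); and (f) follows from (45).
Therefore, starting with (57),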



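$$\liminf_{n \to \infty} \min_{j \in I_{\rm true}} \frac{1}{\mu} \rho_{\rm true}(j)$$
$$\overset{(a)}{\ge} \liminf_{n \to \infty} \min_{j \in I_{\rm true}} \frac{1}{\mu} \frac{s_j - 2\sqrt{z_j s_j} + z_j}{s_j + 2\sqrt{z_j s_j} + 1}$$
$$= \liminf_{n \to \infty} \min_{j \in I_{\rm true}} \frac{1}{\mu} \frac{(\sqrt{s_j} - \sqrt{z_j})^2}{s_j + 2\sqrt{z_j s_j} + 1}$$
$$\overset{(b)}{\ge} \liminf_{n \to \infty} \min_{j \in I_{\rm true}} \frac{1 + \epsilon}{s_j + 2\sqrt{z_j s_j} + 1}$$
$$\overset{(c)}{\ge} \liminf_{n \to \infty} \min_{j \in I_{\rm true}} \frac{1 + \epsilon}{s_j + 2\sqrt{s_j} + 1}$$
$$\overset{(d)}{\ge} \liminf_{n \to \infty} \min_{j \in I_{\rm true}} \frac{1 + \epsilon}{s_{\min} + 2\sqrt{s_{\min}} + 1}$$
$$\overset{(e)}{\ge} \liminf_{n \to \infty} \min_{j \in I_{\rm true}} \frac{1 + \epsilon}{(\sqrt{\gamma} + 1)^2},$$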



where (a) follows from (57); (b) follows from (64); (c) follows
from the fact that $z_j \in [0, 1]$ (it is a Beta distributed random
variable); (d) follows from (62); and (e) follows from the
condition of the hypothesis of the theorem that $\gamma \to 0$. This
proves the first requirement, condition (43).

E. False Alarm Probability 

Now consider any index $j \notin I_{\rm true}$. This implies that $x_j = 0$
and therefore (53) shows that

$$P_{\rm true}(j) y = e_j.$$

Hence from (41),

$$\rho_{\rm true}(j) = \frac{|a_j' e_j|^2}{\|P_{\rm true}(j) a_j\|^2 \|e_j\|^2} = z_j, \qquad (65)$$

where $z_j$ is defined in (59). From the discussion above,
each $z_j$ has the Beta$(2, m_j - 2)$ distribution. Since there are
asymptotically $(1 - \lambda)n$ elements in $I_{\rm true}^c$, the conditions (50)
and (51) along with Lemma 5 show that the limit

$$\limsup_{n \to \infty} \max_{j \notin I_{\rm true}} \frac{m - \lambda n}{\log(n(1 - \lambda))} z_j \le 1 \qquad (66)$$

holds in probability. Therefore,

$$\limsup_{n \to \infty} \max_{j \notin I_{\rm true}} \frac{1}{\mu} \rho_{\rm true}(j) \overset{(a)}{=} \limsup_{n \to \infty} \max_{j \notin I_{\rm true}} \frac{1}{\mu} z_j$$
$$\overset{(b)}{=} \limsup_{n \to \infty} \max_{j \notin I_{\rm true}} \frac{m - \lambda n}{(1 + \epsilon)\log(n(1 - \lambda))} z_j$$
$$\overset{(c)}{\le} \frac{1}{1 + \epsilon},$$

where (a) follows from (65); (b) follows from (46); and (c)
follows from (66). This proves (44) and thus completes the
proof of the theorem.